Data Analysis blog - [cropdoctor] 4. Training a custom YOLO model on a crop disease dataset

🪴 [cropdoctor] AI-based web service development project

4. Training a custom YOLO model on a crop disease dataset

# Standard library
import os
import time
import json
import pickle
import random
import zipfile
import warnings

# Numerics, plotting, and image handling
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw, ImageFile
from tqdm import tqdm

# PyTorch, torchvision, and imbalanced-learn
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from torch.utils.data.sampler import WeightedRandomSampler
from torch.utils.tensorboard import SummaryWriter
from imblearn.over_sampling import SMOTE

# Global settings
plt.rc("font", family="NanumGothic", size=13)
warnings.filterwarnings('ignore')
ImageFile.LOAD_TRUNCATED_IMAGES = True


00. Designing the dataset folder structure

The folder structure used for the earlier PyTorch image classification model is shown below. We created one folder per class and loaded the dataset with ImageFolder and DataLoader.

image classification (PyTorch): one folder per class

data
  |_ training 
      |_ image_class 
          |_ 고추 (pepper)
          |_ 고추질병1 (pepper disease 1)
          |_ ... 

  |_ validation
      |_ image_class 
          |_ 고추 (pepper)
          |_ 고추질병1 (pepper disease 1)
          |_ ...

YOLO: all images gathered together, with the label data (class label and bounding box information) created separately

datasets
    |_ training 
        |_ images
            |_ image_name.jpg
            |_ ...

        |_ labels
            |_ image_name.txt
            |_ ... 

    |_ validation 
        |_ images
            |_ image_name.jpg
            |_ ...

        |_ labels
            |_ image_name.txt
            |_ ...

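YOLO, however, uses the different folder structure shown above, with separate images and labels folders. Instead of grouping images by class, all images go into a single folder. The labels folder holds one txt file per image, with the same name as the image file, and that txt file contains the label information.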
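The YOLO layout above can be created up front with a few lines of Python. This is only a minimal sketch: the helper name `make_yolo_dirs` and its `root` parameter are hypothetical and do not appear in the original notebook.

```python
import os

def make_yolo_dirs(root="datasets"):
    """Create <root>/{training,validation}/{images,labels} and return the paths."""
    created = []
    for split in ("training", "validation"):
        for kind in ("images", "labels"):
            path = os.path.join(root, split, kind)
            os.makedirs(path, exist_ok=True)  # no error if the folder already exists
            created.append(path)
    return created
```

Calling `make_yolo_dirs()` from the project root would produce the same four folders that the path variables in the next step point to.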

📕 reference
Reference link covering YOLO training through prediction

Setting the dataset folder paths

# Path settings
path_train_img = "datasets/training/images/"
path_train_label = "datasets/training/labels/"

path_valid_img = "datasets/validation/images/"
path_valid_label = "datasets/validation/labels/"

Create the folders as designed above and set the paths to match.

01. Converting the existing annotation bboxes to the YOLO bbox format

Reference link for the YOLO bbox format

Loading the label information dictionaries

# train
with open("data_preprocessing/dic_img2label_train_sampling.pickle","rb") as fr:
    dic_img2label_train_sampling = pickle.load(fr)
    
# validation
with open("data_preprocessing/dic_img2label_valid_sampling.pickle","rb") as fr:
    dic_img2label_valid_sampling = pickle.load(fr)

To train YOLO, the existing bboxes have to be converted to the YOLO bbox format. For this we load the dictionaries of label information that were created during EDA.

bbox conversion function

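# Current bbox: (x_min, y_min, x_max, y_max), the top-left and bottom-right corner coordinates
# YOLO bbox: (x_center, y_center, width, height), each divided by the image width or height
def bbox_to_yolo_bbox(bbox, w, h): 
    # bbox holds xtl, ytl (top-left) and xbr, ybr (bottom-right) in pixels
    # Fall back to 3024 x 3024 when the recorded width is below 10 (treated as an invalid size)
    if w < 10: 
        w, h = 3024, 3024 
        
    x_center = ((bbox['xbr'] + bbox['xtl']) / 2) / w
    y_center = ((bbox['ybr'] + bbox['ytl']) / 2) / h
    width = (bbox['xbr'] - bbox['xtl']) / w
    height = (bbox['ybr'] - bbox['ytl']) / h
    return [x_center, y_center, width, height]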
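One way to sanity-check the conversion is to invert it and confirm that the pixel corners come back. The sketch below is not part of the original notebook: `yolo_bbox_to_corners` is a hypothetical helper, and the box and image size are made-up numbers chosen for easy arithmetic.

```python
def yolo_bbox_to_corners(yolo_bbox, w, h):
    """Inverse of the conversion above: normalized (xc, yc, bw, bh) -> pixel corners."""
    x_center, y_center, width, height = yolo_bbox
    return {
        "xtl": (x_center - width / 2) * w,
        "ytl": (y_center - height / 2) * h,
        "xbr": (x_center + width / 2) * w,
        "ybr": (y_center + height / 2) * h,
    }

# Hypothetical example: corners (100, 200) and (300, 600) in a 1000 x 800 image.
# Forward conversion by hand:
#   x_center = ((300 + 100) / 2) / 1000 = 0.2
#   y_center = ((600 + 200) / 2) / 800  = 0.5
#   width    = (300 - 100) / 1000       = 0.2
#   height   = (600 - 200) / 800        = 0.5
corners = yolo_bbox_to_corners([0.2, 0.5, 0.2, 0.5], w=1000, h=800)
print(corners)  # should match the original corners up to floating-point rounding
```

If the recovered corners differ noticeably from the annotation, the width and height passed to the conversion are the first thing to check, since a swapped `size` order would distort both centers and extents.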

train datasets

for img, values in tqdm(dic_img2label_train_sampling.items()):
    filename = os.path.splitext(img)[0]
    
    yolo_bbox = bbox_to_yolo_bbox(values["points"], values["size"][0], values["size"][1])
    bbox_str = " ".join([str(b) for b in yolo_bbox])
    label = values["disease"]
    result = f"{label} {bbox_str}"
       
    if result: 
        with open(os.path.join(path_train_label, f"{filename}.txt"), "w", encoding='utf-8') as f: 
            f.write(result)

100%|██████████| 13915/13915 [00:01<00:00, 12558.49it/s]

validation datasets

for img, values in tqdm(dic_img2label_valid_sampling.items()):
    filename = os.path.splitext(img)[0]
    
    yolo_bbox = bbox_to_yolo_bbox(values["points"], values["size"][0], values["size"][1])
    bbox_str = " ".join([str(b) for b in yolo_bbox])
    label = values["disease"]
    result = f"{label} {bbox_str}"
       
    if result: 
        with open(os.path.join(path_valid_label, f"{filename}.txt"), "w", encoding='utf-8') as f: 
            f.write(result)

100%|██████████| 7873/7873 [00:00<00:00, 15471.80it/s]

Both the train and validation datasets were converted to the YOLO bbox format and written as one txt file per image into the datasets/training/labels and datasets/validation/labels folders. Each txt file contains the label followed by the bbox information.
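YOLOv5 reads each label line as a zero-based integer class index followed by four normalized values, so a quick format check on the generated files can catch problems before training. The sketch below assumes that the `disease` value stored in the dictionaries is already such an index; `parse_yolo_label_line` is a hypothetical helper, and the default of 20 classes mirrors the `nc` value used later in the yaml file.

```python
def parse_yolo_label_line(line, num_classes=20):
    """Parse '<class_index> <x_center> <y_center> <width> <height>' and validate it."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_index = int(parts[0])  # raises ValueError if the label is not an integer
    if not 0 <= class_index < num_classes:
        raise ValueError(f"class index {class_index} is outside 0..{num_classes - 1}")
    coords = [float(p) for p in parts[1:]]
    if not all(0.0 <= c <= 1.0 for c in coords):
        raise ValueError(f"coordinates must lie in [0, 1]: {coords}")
    return class_index, coords
```

Running this over a handful of files in the labels folders is usually enough to reveal a systematic issue such as string labels or unnormalized coordinates.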

02. Creating txt files that list the dataset paths

# List of files in each image folder
lst_train_img = os.listdir(path_train_img)
lst_valid_img = os.listdir(path_valid_img)

lst_train_data = ["/home/elicer/" + path_train_img + img for img in lst_train_img]
lst_valid_data = ["/home/elicer/" + path_valid_img + img for img in lst_valid_img]

# train.txt
with open("train.txt", 'w') as f:
    f.write('\n'.join(lst_train_data) + '\n')

# valid.txt
with open("valid.txt", 'w') as f:
    f.write('\n'.join(lst_valid_data) + '\n')

We also create txt files containing the paths of all image files.
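Because the path lists come from the images folders while the label files come from the dictionaries, it may be worth confirming that every listed image has a matching label file. The following is a minimal sketch; `find_unlabeled_images` is a hypothetical helper that is not part of the original notebook.

```python
import os

def find_unlabeled_images(image_dir, label_dir):
    """Return image file names in image_dir that have no same-named .txt in label_dir."""
    label_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(label_dir)
        if name.endswith(".txt")
    }
    return sorted(
        name
        for name in os.listdir(image_dir)
        if os.path.splitext(name)[0] not in label_stems
    )
```

With the paths defined earlier, `find_unlabeled_images(path_train_img, path_train_label)` would return an empty list when images and labels line up one to one.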

03. Creating the yaml file

import yaml

yaml_data = {
    "names": ["Pepper", "Pepper anthrax", "Pepper white powder bottle",
"Radish", "Radish Black-and-white disease", "Radish Bacteria-free disease",
"Cabbage", "Cabbage black rot disease", "Cabbage roe disease",
"Zucchini", "Zucchini nosocomial disease", "Zucchini white powder disease",
"Bean", "Bean fire disease", "Bean dot disease",
"Tomato", "Tomato leaf blight",
"Pumpkin", "Pumpkin roe disease", "Pumpkin white powder disease"],
    "nc":20, 
    "path": "/",
    "train": "./datasets/train.txt",
    "val": "./datasets/valid.txt",
}

with open("custom.yaml", "w") as f: 
    yaml.dump(yaml_data, f)

Put the class names, written in English, into names as a list, and put the number of classes into nc. Then set train and val to the paths of the txt files created earlier and save everything as a yaml file.
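Since nc has to agree with the length of names, a small consistency check before dumping the dictionary can prevent a failed training run. This is a sketch under assumptions: `check_yolo_config` is a hypothetical helper, and it only inspects the keys used in the dictionary above.

```python
def check_yolo_config(config):
    """Return a list of problems found in a YOLO dataset config dictionary."""
    problems = []
    names = config.get("names", [])
    if config.get("nc") != len(names):
        problems.append(f"nc is {config.get('nc')} but names has {len(names)} entries")
    if len(set(names)) != len(names):
        problems.append("names contains duplicate class names")
    for key in ("train", "val"):
        if not config.get(key):
            problems.append(f"missing '{key}' entry")
    return problems
```

Calling `check_yolo_config(yaml_data)` and getting back an empty list indicates that the class count and the required path entries are at least internally consistent. It does not verify that the txt files exist at those paths.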

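04. Running training

Summary of the files created for training
1) All image files inside the images folder
2) The txt files with label information, one per image file, inside the labels folder
3) The txt files listing the paths of all image files
4) The yaml file containing the class names, the number of classes, and the paths of the txt files from item 3

Run the commands below in a terminal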

git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -qr requirements.txt

python train.py --batch 64 --epochs 20 --data ../custom.yaml --device 0 --weights yolov5s.pt --name test

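In my training environment, one epoch took about 1 hour 30 minutes, so, as with the earlier training, I ran it inside tmux.

When YOLO training finishes, it automatically provides several model evaluation results.

Comparing ground-truth and predicted images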

label

pred

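Looking at the resulting images above, the predicted bbox actually seems to capture the leaf better than the labeled one.

Checking the Precision-Confidence Curve / Precision-Recall Curve / Recall-Confidence Curve

Checking the evaluation metrics in TensorBoard

Over the 20 epochs, the evaluation metrics rise and the loss values fall, as one would hope, but the validation obj_loss trends upward. Judging from the three curve plots above and from TensorBoard, 20 epochs do not seem to be enough training. I expect that setting the number of epochs to 100 or more would give better results.

This wraps up the model training part of the CropDoctor project, an AI-based web service. With more time, I would have liked to make the decision after more model training and experiments, but I see value in having trained and compared both an image classification model and an object detection model. In the end, we decided to serve the web service with the Image Classification mobilenetV2 model, which performs well and has a lightweight architecture.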