rudals252/UTILITY_AI_ANNOTATION_TOOL

rudals252 bc14484576 yolo, mmdet label editing tool

2025-11-17 18:00:06 +09:00


COCO Annotation Utility Tool

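A utility script for freely manipulating annotation files in COCO format.

Main features

Interactive mode (interactive): edit interactively in the terminal (recommended)
Remove classes (remove): delete specific classes, specified by ID or by name
Merge classes (merge): combine several classes into one, specified by ID or by name
Rename classes (rename): change class names, specified by ID or by name
Reassign class IDs (reindex): renumber IDs sequentially
Show info (info): print annotation file information, including per-class counts and the total

Installation

No separate installation is required. The tool uses only the Python 3 standard library.

Quick start - interactive mode (recommended)

Running the script without arguments starts interactive mode: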

python Utility_lableing_tool.py

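In interactive mode:

File selection: pick from the automatically discovered annotation files, or enter a path manually
Operation selection: choose operations from a menu; several operations can be run one after another
Save: write the result once all operations are done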

Interactive mode example

The tool prints its messages in Korean; the session below is shown translated into English.

============================================================
  COCO Annotation Editor - Interactive Mode
============================================================

[1/3] Select annotation file
------------------------------------------------------------

Annotation files found:
  1. dataset/annotations/instances_train.json
  2. dataset/annotations/instances_valid.json
  0. Enter a path manually

Select file number (0 to enter a path): 1

[+] Loading: dataset/annotations/instances_train.json

============================================================
File: dataset/annotations/instances_train.json
============================================================

Overall statistics:
  - Images: 10
  - Annotations: 29
  - Classes: 2

Class details:
ID     Class name                    Annotations
-------------------------------------------------------
1      tt                            13
2      ss                            16
-------------------------------------------------------
Total                                29
============================================================

[2/3] Select operation
------------------------------------------------------------

Available operations:
  1. Remove classes
  2. Merge classes
  3. Rename classes
  4. Reassign class IDs
  5. Show current info
  0. Done (proceed to save)

Select operation (0-5): 3

Current classes:
  - tt (ID: 1)
  - ss (ID: 2)

Enter mapping (format: old:new,old:new - comma-separated)
Example: tt:transformer,ss:substation or 1:transformer,2:substation

Input: tt:transformer,ss:substation

[*] Renaming classes
  - Renamed: 'tt' → 'transformer' (ID: 1)
  - Renamed: 'ss' → 'substation' (ID: 2)
[+] 2 class names were changed

Select operation (0-5): 0

[3/3] Save results
------------------------------------------------------------

Operations performed:
  1. Rename: 2 classes

Input file: dataset/annotations/instances_train.json

Enter output file path (default: dataset/annotations/instances_train_edited.json):

Create a backup if the output file exists? (Y/n): Y

[*] Saving: dataset/annotations/instances_train_edited.json...
[+] Saved: dataset/annotations/instances_train_edited.json

============================================================
  All operations completed successfully!
============================================================

Input:  dataset/annotations/instances_train.json
Output: dataset/annotations/instances_train_edited.json
(A backup was created if the file already existed)

CLI mode usage

The tool can also be run non-interactively with command-line arguments.

1. Show info (info)

Prints a summary of the annotation file.

python Utility_lableing_tool.py info --input dataset/annotations/instances_train.json

Example output (the tool prints Korean; shown translated):

============================================================
File: dataset/annotations/instances_train.json
============================================================

Overall statistics:
  - Images: 10
  - Annotations: 29
  - Classes: 2

Class details:
ID     Class name                    Annotations
-------------------------------------------------------
1      tt                            13
2      ss                            16
-------------------------------------------------------
Total                                29
============================================================
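
The per-class counts in this report can be derived directly from the JSON. The following is a minimal sketch, assuming the standard COCO keys `categories` and `annotations`; the function name `count_annotations` is hypothetical and is not taken from the tool's source.

```python
from collections import Counter


def count_annotations(coco):
    """Return (category id, name, annotation count) rows sorted by id."""
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    return [
        (cat["id"], cat["name"], counts.get(cat["id"], 0))
        for cat in sorted(coco["categories"], key=lambda c: c["id"])
    ]


# Tiny in-memory dataset used instead of a real annotation file.
coco = {
    "images": [{"id": 1}],
    "categories": [{"id": 1, "name": "tt"}, {"id": 2, "name": "ss"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 1, "category_id": 2},
    ],
}
rows = count_annotations(coco)
for cat_id, name, n in rows:
    print(f"{cat_id:<6} {name:<30}{n}")
print("Total", sum(n for _, _, n in rows))
```

Categories with no annotations still appear in the result with a count of 0, which makes empty classes easy to spot before removing them.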

2. Remove classes (remove)

Deletes the given classes together with their annotations. Classes can be specified by name or by ID.

# Remove classes by name
python Utility_lableing_tool.py remove \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_filtered.json \
  --classes "person,car,truck"

# Remove classes by ID
python Utility_lableing_tool.py remove \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_filtered.json \
  --classes "1,2,5"

# Names and IDs can be mixed
python Utility_lableing_tool.py remove \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_filtered.json \
  --classes "1,person,5"
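
Conceptually, removing a class means filtering two lists in the COCO file. The sketch below illustrates that idea under stated assumptions: the classes have already been resolved to numeric IDs, images are left untouched, and the helper `remove_categories` is a hypothetical stand-alone function, not the tool's own implementation.

```python
def remove_categories(coco, remove_ids):
    """Return a copy of `coco` without the given category ids and their annotations."""
    remove_ids = set(remove_ids)
    return {
        **coco,
        "categories": [c for c in coco["categories"] if c["id"] not in remove_ids],
        "annotations": [
            a for a in coco["annotations"] if a["category_id"] not in remove_ids
        ],
    }


coco = {
    "images": [{"id": 1}],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 2, "name": "car"},
        {"id": 5, "name": "truck"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 1, "category_id": 5},
    ],
}
filtered = remove_categories(coco, {1, 5})
print([c["name"] for c in filtered["categories"]])
print(len(filtered["annotations"]))
```

Returning a new dictionary rather than mutating the input keeps the loaded data intact, which matches the README's advice to keep the original file unmodified.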

3. Merge classes (merge)

Merges several classes into one. Classes can be specified by name or by ID.

# Merge classes by name
python Utility_lableing_tool.py merge \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_merged.json \
  --source "car,truck,bus,motorcycle" \
  --target "vehicle"

# Merge classes by ID
python Utility_lableing_tool.py merge \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_merged.json \
  --source "1,2,3,4" \
  --target "vehicle"

# Names and IDs can be mixed
python Utility_lableing_tool.py merge \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_merged.json \
  --source "1,truck,3" \
  --target "vehicle"
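
A merge can be pictured as replacing the source categories with a single entry and repointing their annotations. The sketch below is an illustration only: it follows the rule stated in this README that the smallest source ID becomes the merged ID, assumes the sources are already resolved to numeric IDs, and uses a hypothetical helper named `merge_categories` that keeps only `id` and `name` on the merged entry.

```python
def merge_categories(coco, source_ids, target_name):
    """Return a copy of `coco` with `source_ids` merged into one category."""
    source_ids = set(source_ids)
    target_id = min(source_ids)  # the smallest source id becomes the merged id
    categories = [c for c in coco["categories"] if c["id"] not in source_ids]
    categories.append({"id": target_id, "name": target_name})
    categories.sort(key=lambda c: c["id"])
    annotations = [
        {**a, "category_id": target_id} if a["category_id"] in source_ids else a
        for a in coco["annotations"]
    ]
    return {**coco, "categories": categories, "annotations": annotations}


coco = {
    "images": [{"id": 1}],
    "categories": [
        {"id": 1, "name": "car"},
        {"id": 2, "name": "truck"},
        {"id": 3, "name": "person"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 1, "category_id": 3},
    ],
}
merged = merge_categories(coco, {1, 2}, "vehicle")
print(merged["categories"])
print([a["category_id"] for a in merged["annotations"]])
```

After a merge the remaining IDs are usually no longer consecutive (here 1 and 3), which is why a reindex step often follows.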

4. Rename classes (rename)

Changes class names. The existing class can be specified by name or by ID.

# Rename classes by name
python Utility_lableing_tool.py rename \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_renamed.json \
  --mapping "tt:transformer,ss:substation"

# Rename classes by ID
python Utility_lableing_tool.py rename \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_renamed.json \
  --mapping "1:transformer,2:substation"

# Names and IDs can be mixed
python Utility_lableing_tool.py rename \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_renamed.json \
  --mapping "1:transformer,ss:substation"
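
The `--mapping` value is a comma-separated list of `old:new` pairs. One way such a string could be parsed and applied is sketched below; `parse_mapping` and `rename_categories` are hypothetical helpers written for illustration, and the tool's actual parsing and error handling may differ.

```python
def parse_mapping(text):
    """Parse 'old:new,old:new' into a dict of {old: new}."""
    mapping = {}
    for pair in text.split(","):
        old, sep, new = pair.partition(":")
        if not sep or not old.strip() or not new.strip():
            raise ValueError(f"expected 'old:new', got {pair!r}")
        mapping[old.strip()] = new.strip()
    return mapping


def rename_categories(coco, mapping):
    """Rename categories in place; keys may be ids (as strings) or names."""
    renamed = 0
    for cat in coco["categories"]:
        # Look the category up by id first, then fall back to its name.
        new_name = mapping.get(str(cat["id"]), mapping.get(cat["name"]))
        if new_name is not None:
            cat["name"] = new_name
            renamed += 1
    return renamed


coco = {"categories": [{"id": 1, "name": "tt"}, {"id": 2, "name": "ss"}]}
mapping = parse_mapping("1:transformer,ss:substation")
count = rename_categories(coco, mapping)
print(mapping)
print(count, [c["name"] for c in coco["categories"]])
```

Only the `categories` list changes during a rename; annotations refer to classes by `category_id`, so they need no update.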

5. Reassign class IDs (reindex)

Reassigns class IDs sequentially. This is useful for tidying up IDs after removing or merging classes.

# Reassign IDs starting from 1
python Utility_lableing_tool.py reindex \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_reindexed.json

# Reassign IDs starting from 0 (COCO IDs usually start at 1)
python Utility_lableing_tool.py reindex \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_reindexed.json \
  --start 0
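
Reindexing amounts to building an old-to-new ID table and applying it to both categories and annotations. The sketch below assumes the existing ID order is preserved, as this README states for `reindex`; the helper `reindex_categories` and its `start_id` parameter are written here for illustration and are not copied from the tool.

```python
def reindex_categories(coco, start_id=1):
    """Return a copy of `coco` with category ids renumbered from `start_id`."""
    ordered = sorted(coco["categories"], key=lambda c: c["id"])
    old_to_new = {c["id"]: start_id + i for i, c in enumerate(ordered)}
    return {
        **coco,
        "categories": [{**c, "id": old_to_new[c["id"]]} for c in ordered],
        # Raises KeyError if an annotation points at an unknown category id.
        "annotations": [
            {**a, "category_id": old_to_new[a["category_id"]]}
            for a in coco["annotations"]
        ],
    }


coco = {
    "images": [{"id": 1}],
    "categories": [{"id": 7, "name": "ss"}, {"id": 3, "name": "tt"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 7},
        {"id": 2, "image_id": 1, "category_id": 3},
        {"id": 3, "image_id": 1, "category_id": 7},
    ],
}
from_one = reindex_categories(coco)
from_zero = reindex_categories(coco, start_id=0)
print(from_one["categories"])
print([a["category_id"] for a in from_one["annotations"]])
print([c["id"] for c in from_zero["categories"]])
```

Some training frameworks expect zero-based class indices while COCO files conventionally start at 1, so the starting value is worth checking against whatever consumes the file.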

Advanced usage examples

Example 1: Dataset cleanup workflow

# 1. Check the current state
python Utility_lableing_tool.py info --input instances_train.json

# 2. Remove unneeded classes by ID
python Utility_lableing_tool.py remove \
  --input instances_train.json \
  --output instances_train_step1.json \
  --classes "5,7"

# 3. Merge similar classes by ID
python Utility_lableing_tool.py merge \
  --input instances_train_step1.json \
  --output instances_train_step2.json \
  --source "1,2" \
  --target "vehicle"

# 4. Clean up class names
python Utility_lableing_tool.py rename \
  --input instances_train_step2.json \
  --output instances_train_step3.json \
  --mapping "tt:transformer,ss:substation"

# 5. Reassign IDs
python Utility_lableing_tool.py reindex \
  --input instances_train_step3.json \
  --output instances_train_final.json

# 6. Check the final result
python Utility_lableing_tool.py info --input instances_train_final.json

Example 2: Processing the train and valid datasets together

# Train data
python Utility_lableing_tool.py rename \
  --input dataset/annotations/instances_train.json \
  --output dataset/annotations/instances_train_new.json \
  --mapping "1:transformer,2:substation"

# Valid data (apply the same changes)
python Utility_lableing_tool.py rename \
  --input dataset/annotations/instances_valid.json \
  --output dataset/annotations/instances_valid_new.json \
  --mapping "1:transformer,2:substation"

Using the tool from a Python script

The utility can also be used directly from Python code.

from Utility_lableing_tool import COCOAnnotationEditor

# Load the annotations
editor = COCOAnnotationEditor('dataset/annotations/instances_train.json')

# Show info
editor.print_info()

# Run operations (classes can be given by name or by ID)
editor.remove_categories(['1', '2'])  # remove by ID
editor.remove_categories(['person', 'car'])  # remove by name
editor.rename_categories({'1': 'transformer', 'ss': 'substation'})  # IDs and names can be mixed
editor.reindex_categories(start_id=1)

# Save
editor.save('dataset/annotations/instances_train_modified.json')

# Method chaining also works
editor = COCOAnnotationEditor('input.json')
editor.remove_categories(['1', '2']) \
      .merge_categories(['3', '4'], 'merged_class') \
      .reindex_categories() \
      .save('output.json')

Options

Backups

By default, a backup is created automatically if the output file already exists.

# Create a backup (default)
python Utility_lableing_tool.py rename -i input.json -o output.json -m "old:new"

# Do not create a backup
python Utility_lableing_tool.py rename -i input.json -o output.json -m "old:new" --no-backup

Backup filename format: {original_name}_backup_{YYYYMMDD_HHMMSS}.json
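
A name in this format can be produced with `datetime.strftime`. The sketch below shows one way to build it; the function `backup_path` is hypothetical, and placing the backup in the same directory as the output file is an assumption rather than something this README confirms.

```python
from datetime import datetime
from pathlib import Path


def backup_path(output_path, now=None):
    """Build '{original_name}_backup_{YYYYMMDD_HHMMSS}.json' beside the file."""
    path = Path(output_path)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # Assumption: the backup lives in the same directory as the output file.
    return path.with_name(f"{path.stem}_backup_{stamp}.json")


example = backup_path("out/output.json", datetime(2025, 11, 17, 18, 0, 6))
print(example.name)
```

Because the timestamp has one-second resolution, repeated runs within the same second would map to the same backup name.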

Key features

Support for both IDs and names

Every operation (remove, merge, rename) accepts classes by ID or by name
Numeric input is treated as an ID; non-numeric input is treated as a name
IDs and names can be mixed
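
The numeric-versus-name rule can be expressed in a few lines. The sketch below is a hedged illustration of that rule: `resolve_category` is a hypothetical helper, and the warning text is invented for the example rather than copied from the tool.

```python
def resolve_category(spec, categories):
    """Return the category matching `spec`, or None when nothing matches.

    A spec made only of digits is looked up as an id, anything else as a name.
    """
    spec = spec.strip()
    if spec.isdecimal():
        return next((c for c in categories if c["id"] == int(spec)), None)
    return next((c for c in categories if c["name"] == spec), None)


categories = [{"id": 1, "name": "tt"}, {"id": 2, "name": "ss"}]
for spec in "1,ss,9".split(","):
    match = resolve_category(spec, categories)
    if match is None:
        print(f"warning: no class matches {spec!r}")
    else:
        print(f"{spec!r} -> {match['name']} (id {match['id']})")
```

One consequence of this rule is that a class whose name consists only of digits cannot be addressed by name, since the input would be read as an ID.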

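Annotation statistics

Per-class annotation counts are computed automatically
The total number of annotations is shown
Current information can be checked at any time

Input error handling

An invalid file path prompts for re-entry
An invalid menu choice prompts for re-entry
A class ID or name that does not exist produces a warning

Notes

Back up your data: always back up important data before working on it.
Train/Valid consistency: the classes in the train and valid datasets must be kept identical.
ID order: the reindex command reassigns IDs based on the existing ID order.
Merged ID: the merge command uses the smallest ID as the ID of the new category.
Original file protection: every operation saves to a new file, and the original file is not modified.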
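
The train/valid consistency note can be checked mechanically after editing both files. The following is a small sketch, assuming both files have been loaded as COCO dictionaries; `category_differences` is a hypothetical helper, not part of the tool.

```python
def category_differences(train, valid):
    """Return {id: (train_name, valid_name)} for every id whose names differ.

    A name of None means the id is missing from that split.
    """
    train_names = {c["id"]: c["name"] for c in train["categories"]}
    valid_names = {c["id"]: c["name"] for c in valid["categories"]}
    return {
        cat_id: (train_names.get(cat_id), valid_names.get(cat_id))
        for cat_id in sorted(train_names.keys() | valid_names.keys())
        if train_names.get(cat_id) != valid_names.get(cat_id)
    }


train = {"categories": [{"id": 1, "name": "transformer"}, {"id": 2, "name": "substation"}]}
valid = {"categories": [{"id": 1, "name": "transformer"}, {"id": 2, "name": "ss"}]}
diff = category_differences(train, valid)
print(diff)
```

An empty result means both splits define the same IDs with the same names; any entry points at a class that was edited in only one of the two files.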

Troubleshooting

File not found

# Use an absolute path
python Utility_lableing_tool.py info --input /full/path/to/annotations.json

# Or use a relative path from the current directory
python Utility_lableing_tool.py info --input ./dataset/annotations/instances_train.json

JSON format errors

Check that the file follows the COCO format:

import json

with open('annotations.json', encoding='utf-8') as f:
    data = json.load(f)

# The keys should include 'images', 'annotations', and 'categories'
print(data.keys())
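
Beyond the top-level keys, a common cause of trouble is an annotation that refers to a category that no longer exists. The sketch below extends the check in that direction; `validate_coco` and its messages are hypothetical and only cover these two conditions, not the full COCO specification.

```python
REQUIRED_KEYS = ("images", "annotations", "categories")


def validate_coco(data):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    if problems:
        return problems
    known_ids = {cat["id"] for cat in data["categories"]}
    for ann in data["annotations"]:
        if ann["category_id"] not in known_ids:
            problems.append(
                f"annotation {ann['id']} uses unknown category_id {ann['category_id']}"
            )
    return problems


broken = {
    "images": [],
    "categories": [{"id": 1, "name": "tt"}],
    "annotations": [{"id": 7, "image_id": 1, "category_id": 3}],
}
print(validate_coco(broken))
print(validate_coco({"images": []}))
```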

License

This tool is part of the MMDetection project and follows the same license.
