Converting Between Object Detection Dataset Formats

2023-10-15 15:25:34

1. Converting between the VOC and COCO datasets

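Converting between the VOC dataset (XML format) and the COCO dataset (JSON format)
Directory layout of the VOC and COCO datasets:
Taking VOC2007 as an example, the download contains three folders: Annotations, ImageSets and JPEGImages.
The Annotations folder holds one XML file per image. For example, "2007_000027.xml" describes the image 2007_000027.jpg; opening it in a text editor shows that the data is XML.
The ImageSets folder holds the txt files with the official training/validation split. We mainly use train.txt and val.txt under "ImageSets/Main/": train.txt lists the image names of the official training set, and val.txt lists those of the validation set.
The third folder to note is JPEGImages, which holds the original images under the corresponding names.

<annotation>
    <folder>folder name</folder>
    <filename>image_name.jpg</filename>
    <path>path_to\at002eg001.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>550</width>
        <height>518</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>Apple</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>292</xmin>
            <ymin>218</ymin>
            <xmax>410</xmax>
            <ymax>331</ymax>
        </bndbox>
    </object>
    <object>...</object>
</annotation>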

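An XML file contains the following information:

folder: the folder
filename: the file name
path: the path
source: the source
size: the image size
segmented: used for image segmentation; this article covers only object detection (bounding boxes)
object: an XML file may contain several object elements. Each object describes one box and carries the following information:
name: the class of the object enclosed by the box, for example Apple
bndbox: the coordinates of the top-left and bottom-right corners
truncated: whether the object is truncated
difficult: whether the object is difficult to detect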
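The fields listed above can be read with the standard library. The following is a minimal sketch, not the author's code: the sample string and the helper name read_voc_objects are made up for illustration, and the values are taken from the XML shown earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shortened from the annotation shown above.
SAMPLE_XML = """<annotation>
  <filename>image_name.jpg</filename>
  <size><width>550</width><height>518</height><depth>3</depth></size>
  <object>
    <name>Apple</name>
    <difficult>0</difficult>
    <bndbox><xmin>292</xmin><ymin>218</ymin><xmax>410</xmax><ymax>331</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_objects(xml_text):
    """Return (width, height, [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Go through float() first so that values such as "99.2" are accepted.
        coords = tuple(int(float(box.findtext(tag)))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"),) + coords)
    return width, height, objects


width, height, objects = read_voc_objects(SAMPLE_XML)
print(width, height, objects)
```

For a file on disk, ET.parse(path).getroot() would replace ET.fromstring.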
Unlike VOC, where each image has its own XML file, COCO writes all images and their box information into a single JSON file. A typical COCO directory looks like this:

coco
|______annotations # annotation files
|        |__train.json
|        |__val.json
|        |__test.json
|______trainset # training images
|______valset   # validation images
|______testset  # test images

A standard JSON file contains the following information:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}

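As the overall structure above shows, the value of the info key is a dictionary, while the values of the licenses, images, annotations and categories keys are lists whose elements are again dictionaries.
Calling len() on the images, annotations and categories lists gives the following:

(1) length of the images list = number of images assigned to the training (or test) set;
(2) length of the annotations list = number of bounding boxes in the training (or test) set;
(3) length of the categories list = number of classes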
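As a small illustration of these three counts, the sketch below builds a tiny COCO-style dictionary in memory and measures the lists. The dictionary content and the helper name summarize are assumptions made for this example; with a real dataset, json.load on the annotation file would produce the dictionary.

```python
import json

# Hypothetical miniature dataset: 1 image, 2 boxes, 2 classes.
coco = {
    "images": [{"id": 0, "file_name": "0.jpg", "width": 512, "height": 512}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 2, "bbox": [200.0, 416.0, 64.0, 64.0]},
        {"id": 1, "image_id": 0, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0]},
    ],
    "categories": [{"id": 1, "name": "rectangle"}, {"id": 2, "name": "circle"}],
}


def summarize(coco_dict):
    """Count images, bounding boxes and classes."""
    return {key: len(coco_dict[key]) for key in ("images", "annotations", "categories")}


# Round-trip through JSON text to mimic reading an annotation file.
summary = summarize(json.loads(json.dumps(coco)))
print(summary)
```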

Next, let us look at the content of each key:
(1) info

info{
"year" : int,                # year
"version" : str,             # version
"description" : str,         # detailed description
"contributor" : str,         # author
"url" : str,                 # URL
"date_created" : datetime,   # creation date
}

(2) images

"images": [
    {
        "id": 0,                                        # int, image id; may start from 0
        "file_name": "0.jpg",                           # str, file name
        "width": 512,                                   # int, image width
        "height": 512,                                  # int, image height
        "date_captured": "2020-04-14 01:45:07.508146",  # datetime, capture date
        "license": 1,                                   # int, id of the license that applies
        "coco_url": "",                                 # str, COCO image URL
        "flickr_url": ""                                # str, Flickr image URL
    }
]

(3) licenses

"licenses": [
    {
        "id": 1,        # int, license id; the "license": 1 in images refers to this entry
        "name": null,   # str, license name
        "url": null     # str, license URL
    }
]

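(4) annotations

"annotations": [
    {
        "id": 0,                              # int, id of the annotated object (unique per annotation)
        "image_id": 0,                        # int, id of the image that contains the object
        "category_id": 2,                     # int, class id of the annotated object
        "iscrowd": 0,                         # 0 or 1, whether the annotation marks a crowd of objects; 0 by default
        "area": 4095.9999999999986,           # float, area of the object (64 * 64 = 4096)
        "bbox": [200.0, 416.0, 64.0, 64.0],   # [x, y, width, height], the bounding box
        "segmentation": [[200.0, 416.0, 264.0, 416.0, 264.0, 480.0, 200.0, 480.0]]
    }
]

In "bbox", [x, y, width, height], x and y are the coordinates of the top-left corner of the object.

In "segmentation", [x1, y1, x2, y2, x3, y3, x4, y4] starts at the top-left corner and lists the other three corners clockwise, that is [top-left x, top-left y, top-right x, top-right y, bottom-right x, bottom-right y, bottom-left x, bottom-left y].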
(5) categories

"categories": [
    {
        "id": 1,                   # int, class id
        "name": "rectangle",       # str, class name
        "supercategory": "None"    # str, the broader class it belongs to; e.g. truck and car both belong to motor vehicle
    },
    {"id": 2, "name": "circle", "supercategory": "None"}
]
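The relation between the bbox and segmentation fields of an annotation can be checked with a few lines of code. This is a sketch only; the helper names bbox_to_corners and bbox_to_polygon are made up for illustration, and the numbers come from the annotation example above.

```python
def bbox_to_corners(bbox):
    """COCO [x, y, width, height] -> (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h


def bbox_to_polygon(bbox):
    """Rectangle polygon, clockwise from the top-left corner."""
    x1, y1, x2, y2 = bbox_to_corners(bbox)
    return [x1, y1, x2, y1, x2, y2, x1, y2]


corners = bbox_to_corners([200.0, 416.0, 64.0, 64.0])
polygon = bbox_to_polygon([200.0, 416.0, 64.0, 64.0])
print(corners)
print(polygon)
```

The resulting polygon matches the segmentation value of the sample annotation, and the corner form is what a VOC bndbox stores.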

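1. Converting VOC XML annotations to COCO JSON

voc2coco

Before converting, save the names of all .xml files to be converted in xml_list.txt. If you built the VOC dataset yourself, take care not to mistype the class name (name) when entering labels.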

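# create_xml_list.py
import os

xml_dir = 'C:/Users/user/Desktop/train'
list_path = 'C:/Users/user/Desktop/xml_list.txt'

# Write the name of every .xml file in xml_dir, one per line.
# Mode 'w' overwrites the list, so re-running the script does not duplicate entries.
with open(list_path, 'w') as f:
    for name in sorted(os.listdir(xml_dir)):
        if name.endswith('.xml'):
            f.write(name + '\n')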

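Run python voc2coco.py XML_LIST.txt XML_DIR OUTPUT_JSON.json, where XML_LIST.txt is the path of xml_list.txt, XML_DIR is the directory that actually holds the .xml files, and OUTPUT_JSON.json is the path of the converted file. This turns all listed XML files into a single .json file.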

# voc2coco.py
# Uses only the Python standard library (xml.etree.ElementTree).
import sys
import os
import json
import xml.etree.ElementTree as ET

START_BOUNDING_BOX_ID = 1
PRE_DEFINE_CATEGORIES = {}
# If necessary, pre-define category and its id
#  PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4,
#  "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9,
#  "cow": 10, "diningtable": 11, "dog": 12, "horse": 13,
#  "motorbike": 14, "person": 15, "pottedplant": 16,
#  "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20}


def get(root, name):
    return root.findall(name)


def get_and_check(root, name, length):
    vars = root.findall(name)
    if len(vars) == 0:
        raise NotImplementedError('Can not find %s in %s.' % (name, root.tag))
    if length > 0 and len(vars) != length:
        raise NotImplementedError('The size of %s is supposed to be %d, but is %d.' % (name, length, len(vars)))
    if length == 1:
        vars = vars[0]
    return vars


def get_filename_as_int(filename):
    try:
        filename = os.path.splitext(filename)[0]
        return int(filename)
    except ValueError:
        raise NotImplementedError('Filename %s is supposed to be an integer.' % (filename))


def read_coord(bndbox, tag):
    # Coordinates may be stored as float strings such as '99.2'. int('99.2') raises
    # "ValueError: invalid literal for int() with base 10", so convert the string
    # to float first and then to int.
    return int(float(get_and_check(bndbox, tag, 1).text))


def convert(xml_list, xml_dir, json_file):
    json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
    categories = PRE_DEFINE_CATEGORIES
    bnd_id = START_BOUNDING_BOX_ID
    with open(xml_list, 'r') as list_fp:
        for line in list_fp:
            line = line.strip()
            if not line:
                continue
            print("Processing %s" % (line))
            xml_f = os.path.join(xml_dir, line)
            tree = ET.parse(xml_f)
            root = tree.getroot()
            path = get(root, 'path')
            if len(path) == 1:
                # Accept Windows-style paths as well
                filename = os.path.basename(path[0].text.replace('\\', '/'))
            elif len(path) == 0:
                filename = get_and_check(root, 'filename', 1).text
            else:
                raise NotImplementedError('%d paths found in %s' % (len(path), line))
            # The filename must be a number
            image_id = get_filename_as_int(filename)
            size = get_and_check(root, 'size', 1)
            width = int(get_and_check(size, 'width', 1).text)
            height = int(get_and_check(size, 'height', 1).text)
            image = {'file_name': filename, 'height': height, 'width': width, 'id': image_id}
            json_dict['images'].append(image)
            # Currently we do not support segmentation
            #  segmented = get_and_check(root, 'segmented', 1).text
            #  assert segmented == '0'
            for obj in get(root, 'object'):
                category = get_and_check(obj, 'name', 1).text
                if category not in categories:
                    new_id = len(categories)
                    categories[category] = new_id
                category_id = categories[category]
                bndbox = get_and_check(obj, 'bndbox', 1)
                xmin = read_coord(bndbox, 'xmin') - 1
                ymin = read_coord(bndbox, 'ymin') - 1
                xmax = read_coord(bndbox, 'xmax')
                ymax = read_coord(bndbox, 'ymax')
                assert xmax > xmin
                assert ymax > ymin
                o_width = abs(xmax - xmin)
                o_height = abs(ymax - ymin)
                ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': image_id,
                       'bbox': [xmin, ymin, o_width, o_height],
                       'category_id': category_id, 'id': bnd_id, 'ignore': 0,
                       'segmentation': []}
                json_dict['annotations'].append(ann)
                bnd_id = bnd_id + 1

    for cate, cid in categories.items():
        cat = {'supercategory': 'none', 'id': cid, 'name': cate}
        json_dict['categories'].append(cat)
    with open(json_file, 'w') as json_fp:
        json.dump(json_dict, json_fp)


if __name__ == '__main__':
    if len(sys.argv) != 4:
        print('3 arguments are needed.')
        print('Usage: %s XML_LIST.txt XML_DIR OUTPUT_JSON.json' % (sys.argv[0]))
        exit(1)
    convert(sys.argv[1], sys.argv[2], sys.argv[3])

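Note that image_id here is the image name without the .jpg extension, so the image names must be integers. If they are not, first rename all images and label files to numbers, then convert to COCO.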

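import os

img_dir = 'F:/Billboard/dataset/images/'
lab_dir = 'F:/Billboard/dataset/labels/'

# Rename every image and its label file to a running integer: 0.jpg / 0.txt, 1.jpg / 1.txt, ...
# Assumes the images are .jpg files and each image has a label file with the same base name.
# Run it only on names that are not already numeric, otherwise a rename may collide
# with an existing file.
for i, name in enumerate(sorted(os.listdir(img_dir))):
    stem = os.path.splitext(name)[0]
    os.rename(img_dir + name, img_dir + str(i) + '.jpg')
    os.rename(lab_dir + stem + '.txt', lab_dir + str(i) + '.txt')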
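If renaming files on disk is undesirable, one possible alternative is to keep the names and assign integer ids through a lookup table. The sketch below only illustrates that idea; build_image_ids and the sample file names are hypothetical and are not used by the conversion scripts in this article.

```python
import os


def build_image_ids(file_names, start_id=1):
    """Map each base name (without extension) to a stable integer id."""
    stems = sorted({os.path.splitext(os.path.basename(n))[0] for n in file_names})
    return {stem: start_id + i for i, stem in enumerate(stems)}


# Sorting makes the ids reproducible no matter how the files were listed.
image_ids = build_image_ids(["cat_b.jpg", "cat_a.jpg", "dog.xml"])
print(image_ids)
```

A converter could then look up image_ids[stem] instead of calling int() on the file name.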

The second method converts without the tedious preparation: only xml_dir and json_file need to be changed.

import os
import json
import warnings
import xml.etree.ElementTree as ET
import glob

START_BOUNDING_BOX_ID = 1
# category_id values are generated from the classes you define here.
# Many COCO-style pipelines treat id 0 as the background class.
# CenterNet expects class ids that start from 0, otherwise heatmap generation fails.
PRE_DEFINE_CATEGORIES = {'ignored regions': 1, 'pedestrian': 2, 'people': 3,
                         'bicycle': 4, 'car': 5, 'van': 6, 'truck': 7,
                         'tricycle': 8, 'awning-tricycle': 9, 'bus': 10,
                         'motor': 11, 'others': 12}
START_IMAGE_ID = 0
# If necessary, pre-define category and its id
#  PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4,
#  "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9,
#  "cow": 10, "diningtable": 11, "dog": 12, "horse": 13,
#  "motorbike": 14, "person": 15, "pottedplant": 16,
#  "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20}


def get(root, name):
    return root.findall(name)


def get_and_check(root, name, length):
    vars = root.findall(name)
    if len(vars) == 0:
        raise ValueError("Can not find %s in %s." % (name, root.tag))
    if length > 0 and len(vars) != length:
        raise ValueError("The size of %s is supposed to be %d, but is %d." % (name, length, len(vars)))
    if length == 1:
        vars = vars[0]
    return vars


def get_filename_as_int(filename):
    # Not called by default (see convert); useful when file names are numeric.
    filename = filename.replace("\\", "/")
    stem = os.path.splitext(os.path.basename(filename))[0]
    try:
        return int(stem)
    except ValueError:
        # Fall back to a checksum of the characters; not guaranteed to be unique.
        return sum(ord(char) % 10000 for char in stem)


def get_categories(xml_files):
    """Generate category name to id mapping from a list of xml files.

    Arguments:
        xml_files {list} -- A list of xml file paths.

    Returns:
        dict -- category name to id mapping.
    """
    classes_names = set()
    for xml_file in xml_files:
        root = ET.parse(xml_file).getroot()
        for member in root.findall("object"):
            classes_names.add(member.findtext("name"))
    return {name: i for i, name in enumerate(sorted(classes_names))}


def convert(xml_files, json_file):
    json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
    if PRE_DEFINE_CATEGORIES is not None:
        categories = PRE_DEFINE_CATEGORIES
    else:
        categories = get_categories(xml_files)
    bnd_id = START_BOUNDING_BOX_ID
    image_id = START_IMAGE_ID
    for xml_file in xml_files:
        tree = ET.parse(xml_file)
        root = tree.getroot()
        path = get(root, "path")
        if len(path) == 1:
            filename = os.path.basename(path[0].text.replace("\\", "/"))
        elif len(path) == 0:
            filename = get_and_check(root, "filename", 1).text
        else:
            raise ValueError("%d paths found in %s" % (len(path), xml_file))
        # Image ids are assigned sequentially, so file names need not be numbers.
        # image_id = get_filename_as_int(filename)
        size = get_and_check(root, "size", 1)
        width = int(get_and_check(size, "width", 1).text)
        height = int(get_and_check(size, "height", 1).text)
        # Append the default suffix only when the name has neither .jpg nor .png
        if not filename.lower().endswith((".jpg", ".png")):
            filename = filename + ".jpg"
            warnings.warn("filename's default suffix is jpg")
        images = {
            "file_name": filename,  # image name
            "height": height,
            "width": width,
            "id": image_id,  # image id (unique per image)
        }
        json_dict["images"].append(images)
        # Currently we do not support segmentation.
        #  segmented = get_and_check(root, 'segmented', 1).text
        #  assert segmented == '0'
        for obj in get(root, "object"):
            category = get_and_check(obj, "name", 1).text
            if category not in categories:
                new_id = len(categories)
                categories[category] = new_id
            category_id = categories[category]
            bndbox = get_and_check(obj, "bndbox", 1)
            xmin = int(get_and_check(bndbox, "xmin", 1).text) - 1
            ymin = int(get_and_check(bndbox, "ymin", 1).text) - 1
            xmax = int(get_and_check(bndbox, "xmax", 1).text)
            ymax = int(get_and_check(bndbox, "ymax", 1).text)
            assert xmax > xmin
            assert ymax > ymin
            o_width = abs(xmax - xmin)
            o_height = abs(ymax - ymin)
            ann = {
                "area": o_width * o_height,
                "iscrowd": 0,
                "image_id": image_id,  # id of the image (matches the id in images)
                "bbox": [xmin, ymin, o_width, o_height],
                "category_id": category_id,
                "id": bnd_id,  # one image may have several annotations
                "ignore": 0,
                "segmentation": [],
            }
            json_dict["annotations"].append(ann)
            bnd_id = bnd_id + 1
        image_id += 1

    for cate, cid in categories.items():
        cat = {"supercategory": "none", "id": cid, "name": cate}
        json_dict["categories"].append(cat)

    out_dir = os.path.dirname(json_file)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(json_file, "w") as json_fp:
        json.dump(json_dict, json_fp, indent=4)


if __name__ == "__main__":
    # import argparse
    # parser = argparse.ArgumentParser(
    #     description="Convert Pascal VOC annotation to COCO format."
    # )
    # parser.add_argument("xml_dir", help="Directory path to xml files.", type=str)
    # parser.add_argument("json_file", help="Output COCO format json file.", type=str)
    # args = parser.parse_args()
    # args.xml_dir
    # args.json_file
    xml_dir = "./xml"
    json_file = "./train.json"  # output json
    xml_files = glob.glob(os.path.join(xml_dir, "*.xml"))
    # If you want to do train/test split, you can pass a subset of xml files to convert function.
    print("Number of xml files: {}".format(len(xml_files)))
    convert(xml_files, json_file)
    print("Success: {}".format(json_file))

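This version is especially convenient because it also splits the data into a training set and a validation set.
Remember to modify:

classes: your own object classes
xml_dir: directory of the XML annotation files
img_dir: directory of the images (in this setup, the parent directory of xml_dir)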
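The split itself can be looked at in isolation. The following sketch uses the standard random module instead of NumPy and a made-up helper name split_train_val; the ratio 0.85 and the seed 100 mirror the values used in the script, while the file names are placeholders.

```python
import random


def split_train_val(items, train_ratio=0.85, seed=100):
    """Shuffle reproducibly, then cut the list at train_ratio."""
    ordered = sorted(items)      # fixed starting order
    rng = random.Random(seed)    # fixed seed, so the split is repeatable
    rng.shuffle(ordered)
    train_num = int(len(ordered) * train_ratio)
    return ordered[:train_num], ordered[train_num:]


files = ["{}.xml".format(i) for i in range(20)]
train, val = split_train_val(files)
print(len(train), len(val))
```

Sorting before shuffling matters: without it, the result would depend on the order in which the file system happens to list the files.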

# coding: utf-8
import os
import glob
import json
import shutil
import numpy as np
import xml.etree.ElementTree as ET

path2 = "./coco/"  # output folder
# classes = ['plane', 'baseball-diamond', 'bridge', 'ground-track-field',
# 'small-vehicle', 'large-vehicle', 'ship',
# 'tennis-court', 'basketball-court',
# 'storage-tank', 'soccer-ball-field',
# 'roundabout', 'harbor',
# 'swimming-pool', 'helicopter','container-crane',]  # classes
classes = ['plastic_bag', 'carton', 'plastic_bottle', 'hydrophyte', 'deciduous_aggregates', 'plastic_cup', 'cans']
xml_dir = "Annotations/"  # XML files
img_dir = "/media/wntlab/39e84b7d-5985-43ce-a0fa-a7f312f85897/HJK/dataset/data_voc_2021.11.1/"  # images
train_ratio = 0.85  # fraction of the data used for training
IMG_EXT = ".JPG"  # image extension expected for every XML file
START_BOUNDING_BOX_ID = 1

# Class name -> id, starting from 1
pre_define_categories = {cls: i + 1 for i, cls in enumerate(classes)}
# pre_define_categories = {'a1': 1, 'a3': 2, 'a6': 3, 'a9': 4, "a10": 5}
# True: skip objects whose class is not pre-defined. False: give them a new id automatically.
only_care_pre_define_categories = True


def get(root, name):
    return root.findall(name)


def get_and_check(root, name, length):
    vars = root.findall(name)
    if len(vars) == 0:
        raise NotImplementedError('Can not find %s in %s.' % (name, root.tag))
    if length > 0 and len(vars) != length:
        raise NotImplementedError('The size of %s is supposed to be %d, but is %d.' % (name, length, len(vars)))
    if length == 1:
        vars = vars[0]
    return vars


def convert(xml_list, json_file):
    json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
    categories = pre_define_categories.copy()
    bnd_id = START_BOUNDING_BOX_ID
    all_categories = {}
    for index, line in enumerate(xml_list):
        # print("Processing %s" % (line))
        xml_f = line
        tree = ET.parse(xml_f)
        root = tree.getroot()

        filename = os.path.splitext(os.path.basename(xml_f))[0] + IMG_EXT
        image_id = 20190000001 + index
        size = get_and_check(root, 'size', 1)
        width = int(get_and_check(size, 'width', 1).text)
        height = int(get_and_check(size, 'height', 1).text)
        image = {'file_name': filename, 'height': height, 'width': width, 'id': image_id}
        json_dict['images'].append(image)
        # Currently we do not support segmentation
        #  segmented = get_and_check(root, 'segmented', 1).text
        #  assert segmented == '0'
        for obj in get(root, 'object'):
            category = get_and_check(obj, 'name', 1).text
            all_categories[category] = all_categories.get(category, 0) + 1
            if category not in categories:
                if only_care_pre_define_categories:
                    continue
                new_id = len(categories) + 1
                print("[warning] category '{}' not in 'pre_define_categories'({}), create new id: {} automatically".format(category, pre_define_categories, new_id))
                categories[category] = new_id
            category_id = categories[category]
            bndbox = get_and_check(obj, 'bndbox', 1)
            xmin = int(float(get_and_check(bndbox, 'xmin', 1).text))
            ymin = int(float(get_and_check(bndbox, 'ymin', 1).text))
            xmax = int(float(get_and_check(bndbox, 'xmax', 1).text))
            ymax = int(float(get_and_check(bndbox, 'ymax', 1).text))
            assert xmax > xmin, "xmax <= xmin, {}".format(line)
            assert ymax > ymin, "ymax <= ymin, {}".format(line)
            o_width = abs(xmax - xmin)
            o_height = abs(ymax - ymin)
            ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': image_id,
                   'bbox': [xmin, ymin, o_width, o_height],
                   'category_id': category_id, 'id': bnd_id, 'ignore': 0,
                   'segmentation': []}
            json_dict['annotations'].append(ann)
            bnd_id = bnd_id + 1

    for cate, cid in categories.items():
        cat = {'supercategory': 'none', 'id': cid, 'name': cate}
        json_dict['categories'].append(cat)
    with open(json_file, 'w') as json_fp:
        json.dump(json_dict, json_fp)
    print("------------create {} done--------------".format(json_file))
    print("find {} categories: {} -->>> your pre_define_categories {}: {}".format(len(all_categories), all_categories.keys(), len(pre_define_categories), pre_define_categories.keys()))
    print("category: id --> {}".format(categories))
    print(categories.keys())
    print(categories.values())


if __name__ == '__main__':
    # Recreate the output folders (existing content is deleted)
    for sub in ("annotations", "train2017", "val2017"):
        sub_dir = os.path.join(path2, sub)
        if os.path.exists(sub_dir):
            shutil.rmtree(sub_dir)
        os.makedirs(sub_dir)

    save_json_train = os.path.join(path2, 'annotations/instances_train2017.json')
    save_json_val = os.path.join(path2, 'annotations/instances_val2017.json')

    xml_list = glob.glob(os.path.join(xml_dir, "*.xml"))
    xml_list = np.sort(xml_list)
    np.random.seed(100)
    np.random.shuffle(xml_list)

    train_num = int(len(xml_list) * train_ratio)
    xml_list_train = xml_list[:train_num]
    xml_list_val = xml_list[train_num:]

    convert(xml_list_train, save_json_train)
    convert(xml_list_val, save_json_val)

    # Write the split lists and copy the images into train2017/ and val2017/.
    # The validation list is written to test.txt, as in the original script.
    for list_name, sub, xmls in (("train.txt", "train2017", xml_list_train),
                                 ("test.txt", "val2017", xml_list_val)):
        with open(os.path.join(path2, list_name), "w") as f:
            for xml in xmls:
                stem = os.path.splitext(os.path.basename(xml))[0]
                img = os.path.join(img_dir, stem + IMG_EXT)
                f.write(stem + "\n")
                shutil.copyfile(img, os.path.join(path2, sub, os.path.basename(img)))

    print("-------------------------------")
    print("train number:", len(xml_list_train))
    print("val number:", len(xml_list_val))

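2. Converting COCO JSON to VOC XML

To convert a COCO-format JSON file into VOC-format XML files, set anno to the path of the JSON file and xml_dir to the directory in which the XML files should be saved, then run the code below.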
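The core of this direction is to collect, for every image, its annotations and to turn each COCO bbox into VOC corner coordinates. A minimal sketch of that step without pandas, assuming a hypothetical in-memory dictionary and a made-up helper name group_boxes_by_image:

```python
from collections import defaultdict


def group_boxes_by_image(coco_dict):
    """Return {image_id: [(class name, xmin, ymin, xmax, ymax), ...]}."""
    id_to_name = {c["id"]: c["name"] for c in coco_dict["categories"]}
    grouped = defaultdict(list)
    for ann in coco_dict["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO stores top-left corner plus width and height
        grouped[ann["image_id"]].append(
            (id_to_name[ann["category_id"]], int(x), int(y), int(x + w), int(y + h)))
    return dict(grouped)


# Hypothetical sample data for illustration only.
sample = {
    "categories": [{"id": 1, "name": "rectangle"}, {"id": 2, "name": "circle"}],
    "annotations": [
        {"image_id": 0, "category_id": 2, "bbox": [200.0, 416.0, 64.0, 64.0]},
        {"image_id": 0, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0]},
        {"image_id": 1, "category_id": 1, "bbox": [0.0, 0.0, 5.0, 5.0]},
    ],
}
boxes = group_boxes_by_image(sample)
print(boxes)
```

Each tuple holds exactly what one VOC object element needs: the name and the four bndbox values.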

# coco2voc.py
import os
import json
import pandas as pd
from tqdm import tqdm

# Path of the JSON file and directory in which the XML files are stored
anno = 'C:/Users/user/Desktop/val/instances_val2017.json'
xml_dir = 'C:/Users/user/Desktop/val/xml/'


def convert(anno, xml_dir):
    with open(anno, 'r') as load_f:
        f = json.load(load_f)
    os.makedirs(xml_dir, exist_ok=True)

    imgs = f['images']  # one entry per image; links image ids to file names
    # Category id -> category name
    id_to_name = {cat['id']: cat['name'] for cat in f['categories']}
    print('categories = ', [id_to_name[cid] for cid in sorted(id_to_name)])
    df_anno = pd.DataFrame(f['annotations'])  # annotations of the JSON file

    for img in tqdm(imgs):  # outer loop over all images; tqdm shows a progress bar
        file_name = img['file_name']
        height = img['height']
        width = img['width']
        img_id = img['id']

        # The element names follow the VOC layout described in section 1.
        xml_content = []
        xml_content.append('<?xml version="1.0" encoding="utf-8"?>')
        xml_content.append("<annotation>")
        xml_content.append("    <filename>" + file_name + "</filename>")
        xml_content.append("    <size>")
        xml_content.append("        <width>" + str(width) + "</width>")
        xml_content.append("        <height>" + str(height) + "</height>")
        xml_content.append("        <depth>" + "3" + "</depth>")
        xml_content.append("    </size>")

        # Find the annotations of this image by img_id
        annos = df_anno[df_anno["image_id"].isin([img_id])]
        for index, row in annos.iterrows():  # all annotations of one image
            bbox = row["bbox"]  # COCO bbox: [x, y, width, height]
            cate_name = id_to_name[row["category_id"]]
            # add new object
            xml_content.append("    <object>")
            xml_content.append("        <name>" + cate_name + "</name>")
            xml_content.append("        <truncated>0</truncated>")
            xml_content.append("        <difficult>0</difficult>")
            xml_content.append("        <bndbox>")
            xml_content.append("            <xmin>" + str(int(bbox[0])) + "</xmin>")
            xml_content.append("            <ymin>" + str(int(bbox[1])) + "</ymin>")
            xml_content.append("            <xmax>" + str(int(bbox[0] + bbox[2])) + "</xmax>")
            xml_content.append("            <ymax>" + str(int(bbox[1] + bbox[3])) + "</ymax>")
            xml_content.append("        </bndbox>")
            xml_content.append("    </object>")
        xml_content.append("</annotation>")

        # Write the lines to <image name>.xml
        stem = os.path.splitext(os.path.basename(file_name))[0]
        xml_path = os.path.join(xml_dir, stem + '.xml')
        print(xml_path)
        with open(xml_path, 'w', encoding="utf8") as xml_f:
            xml_f.write('\n'.join(xml_content))


if __name__ == '__main__':
    convert(anno, xml_dir)

3. VOC to YOLO

import xml.etree.ElementTree as ET
import os


# box: [xmin, ymin, xmax, ymax]
def convert(size, box):
    x_center = (box[2] + box[0]) / 2.0
    y_center = (box[3] + box[1]) / 2.0
    # Normalize the center
    x = x_center / size[0]
    y = y_center / size[1]
    # Compute width and height and normalize them
    w = (box[2] - box[0]) / size[0]
    h = (box[3] - box[1]) / size[1]
    return (x, y, w, h)


def convert_annotation(xml_paths, yolo_paths, classes):
    # os.listdir returns the files in arbitrary order
    xml_files = [name for name in os.listdir(xml_paths) if name.endswith(".xml")]
    print(f'xml_files:{xml_files}')
    os.makedirs(yolo_paths, exist_ok=True)
    for file in xml_files:
        xml_file_path = os.path.join(xml_paths, file)
        yolo_txt_path = os.path.join(yolo_paths, os.path.splitext(file)[0] + ".txt")
        tree = ET.parse(xml_file_path)
        root = tree.getroot()
        size = root.find("size")
        # Read width and height from the XML
        w = int(size.find("width").text)
        h = int(size.find("height").text)
        # There may be several object elements, so iterate over them
        with open(yolo_txt_path, 'w') as f:
            for obj in root.iter("object"):
                difficult = obj.find("difficult").text
                # Class name
                cls = obj.find("name").text
                # The text of <difficult> is a string, so convert it before comparing
                if cls not in classes or int(difficult) == 1:
                    continue
                # Convert to the label form read during training
                cls_id = classes.index(cls)
                xml_box = obj.find("bndbox")
                box = (float(xml_box.find("xmin").text), float(xml_box.find("ymin").text),
                       float(xml_box.find("xmax").text), float(xml_box.find("ymax").text))
                boxex = convert((w, h), box)
                # Standard YOLO line: class x_center y_center width height
                f.write(str(cls_id) + " " + " ".join([str(s) for s in boxex]) + '\n')


if __name__ == "__main__":
    # Classes of the data
    classes_train = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
                     'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others']
    # Directory of the XML files
    xml_dir = "./xml1/"
    # Directory of the YOLO label files
    yolo_txt_dir = "./Yolo_txt/"
    # VOC to YOLO
    convert_annotation(xml_paths=xml_dir, yolo_paths=yolo_txt_dir, classes=classes_train)

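Before converting, specify classes_train (the classes of the training set), xml_dir (directory of the VOC-format annotations) and yolo_txt_dir (directory in which the YOLO-format labels are saved).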
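The arithmetic behind the VOC to YOLO conversion is easy to verify by hand. The sketch below repeats the normalization on one made-up box: an image of 100 x 200 pixels with a box from (10, 20) to (50, 100). The function name voc_to_yolo and the numbers are chosen for illustration only.

```python
def voc_to_yolo(size, box):
    """size = (width, height); box = (xmin, ymin, xmax, ymax) in pixels."""
    img_w, img_h = size
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w   # (10 + 50) / 2 / 100 = 0.3
    y_center = (ymin + ymax) / 2.0 / img_h   # (20 + 100) / 2 / 200 = 0.3
    width = (xmax - xmin) / img_w            # 40 / 100 = 0.4
    height = (ymax - ymin) / img_h           # 80 / 200 = 0.4
    return x_center, y_center, width, height


yolo_box = voc_to_yolo((100, 200), (10, 20, 50, 100))
print(yolo_box)
```

All four values lie between 0 and 1, which is why YOLO labels stay valid when the image is resized.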

4. YOLO to VOC

from xml.dom.minidom import Document
import os
import cv2


def add_text_element(doc, parent, tag, text):
    """Append <tag>text</tag> to parent and return the new element."""
    element = doc.createElement(tag)
    element.appendChild(doc.createTextNode(str(text)))
    parent.appendChild(element)
    return element


def makexml(picPath, txtPath, xmlPath):  # image folder, YOLO txt folder, folder for the XML output
    """Convert YOLO-format txt annotation files to VOC-format XML annotation files."""
    # Maps the YOLO class index to the class name written to the XML.
    # It must match your own classes.txt, in the same order.
    dic = {'0': "0",
           '1': "1",
           }
    files = os.listdir(txtPath)
    print(files)
    for name in files:
        xmlBuilder = Document()
        annotation = xmlBuilder.createElement("annotation")
        xmlBuilder.appendChild(annotation)
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        print(name[0:-4])
        Pheight, Pwidth, Pdepth = img.shape

        add_text_element(xmlBuilder, annotation, "folder", "driving_annotation_dataset")
        add_text_element(xmlBuilder, annotation, "filename", name[0:-4] + ".jpg")

        size = xmlBuilder.createElement("size")
        add_text_element(xmlBuilder, size, "width", Pwidth)
        add_text_element(xmlBuilder, size, "height", Pheight)
        add_text_element(xmlBuilder, size, "depth", Pdepth)
        annotation.appendChild(size)

        for line in txtList:
            oneline = line.strip().split(" ")
            if len(oneline) < 5:
                continue  # skip empty or malformed lines
            obj = xmlBuilder.createElement("object")
            add_text_element(xmlBuilder, obj, "name", dic[oneline[0]])
            add_text_element(xmlBuilder, obj, "pose", "Unspecified")
            add_text_element(xmlBuilder, obj, "truncated", "0")
            add_text_element(xmlBuilder, obj, "difficult", "0")

            # Center in pixels (+1 as in the original formula), half width and half height
            x_c = float(oneline[1]) * Pwidth + 1
            y_c = float(oneline[2]) * Pheight + 1
            half_w = float(oneline[3]) * 0.5 * Pwidth
            half_h = float(oneline[4]) * 0.5 * Pheight

            bndbox = xmlBuilder.createElement("bndbox")
            add_text_element(xmlBuilder, bndbox, "xmin", int(x_c - half_w))
            add_text_element(xmlBuilder, bndbox, "ymin", int(y_c - half_h))
            add_text_element(xmlBuilder, bndbox, "xmax", int(x_c + half_w))
            add_text_element(xmlBuilder, bndbox, "ymax", int(y_c + half_h))
            obj.appendChild(bndbox)
            annotation.appendChild(obj)

        with open(xmlPath + name[0:-4] + ".xml", 'w', encoding='utf-8') as f:
            xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')


if __name__ == "__main__":
    picPath = "model/YOLOX/datasets/VOC/VOCdevkit/VOC2007/JPEGImages/"  # image folder; the trailing / is required
    txtPath = "model/YOLOX/datasets/VOC/VOCdevkit/VOC2007/labels/lables/"  # txt folder; the trailing / is required
    xmlPath = "model/YOLOX/datasets/VOC/VOCdevkit/VOC2007/Annotations/"  # folder for the XML files; the trailing / is required
    makexml(picPath, txtPath, xmlPath)

To use the code above, adjust dic, picPath, txtPath and xmlPath to your own setup. This completes the conversion from YOLO format to VOC format.
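The coordinate step of this conversion can also be tried on its own. The sketch below is a simplified variant and not the script's exact formula: it rounds to the nearest pixel and does not add the offset of 1 that the script applies. The name yolo_to_voc and the sample numbers are assumptions for illustration.

```python
def yolo_to_voc(size, yolo_box):
    """size = (width, height); yolo_box = normalized (x_center, y_center, w, h)."""
    img_w, img_h = size
    x_c, y_c, w, h = yolo_box
    # round() instead of int() avoids losing a pixel to floating point error,
    # e.g. (0.3 - 0.2) * 100 evaluates to 9.999999999999998.
    xmin = round((x_c - w / 2) * img_w)
    ymin = round((y_c - h / 2) * img_h)
    xmax = round((x_c + w / 2) * img_w)
    ymax = round((y_c + h / 2) * img_h)
    return xmin, ymin, xmax, ymax


voc_box = yolo_to_voc((100, 200), (0.3, 0.3, 0.4, 0.4))
print(voc_box)
```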

5. YOLO to COCO

"""
Convert a YOLO-format dataset to a COCO-format dataset.
--root_path  input root path
"""
import os
import cv2
import json
from tqdm import tqdm
import argparse
import glob

parser = argparse.ArgumentParser("ROOT SETTING")
parser.add_argument('--root_path', type=str, default='coco', help="root path of images and labels")
arg = parser.parse_args()

# The default split is 8:1:1. The first split point is at 8/10, the second at 9/10.
VAL_SPLIT_POINT = 4 / 5
TEST_SPLIT_POINT = 9 / 10

root_path = arg.root_path
print(root_path)

# Pattern of the original label files
originLabelsDir = os.path.join(root_path, 'labels/*/*.txt')
# Pattern of the images that belong to the labels
originImagesDir = os.path.join(root_path, 'images/*/*.jpg')
# The datasets collect the image and annotation information of each split
train_dataset = {'categories': [], 'annotations': [], 'images': []}
val_dataset = {'categories': [], 'annotations': [], 'images': []}
test_dataset = {'categories': [], 'annotations': [], 'images': []}

# Read the class names
with open(os.path.join(root_path, 'classes.txt')) as f:
    classes = f.read().strip().split()

# Map class names to numeric ids (starting from 1)
for i, cls in enumerate(classes, 1):
    for dataset in (train_dataset, val_dataset, test_dataset):
        dataset['categories'].append({'id': i, 'name': cls, 'supercategory': 'fish'})

# Collect the image paths
indexes = glob.glob(originImagesDir)
print(len(indexes))

# --------------- convert the data above to the layout COCO requires ---------------
ann_id = 0  # running annotation id, so that every annotation gets its own id
for k, index in enumerate(tqdm(indexes)):
    txtFile = os.path.splitext(index.replace('images', 'labels'))[0] + '.txt'
    # Switch the dataset reference to split the data
    if k + 1 > round(len(indexes) * VAL_SPLIT_POINT):
        if k + 1 > round(len(indexes) * TEST_SPLIT_POINT):
            dataset = test_dataset
        else:
            dataset = val_dataset
    else:
        dataset = train_dataset
    # Images without a label file are skipped
    if not os.path.exists(txtFile):
        continue
    # Read the image with OpenCV to get its width and height
    im = cv2.imread(index)
    H, W, _ = im.shape
    dataset['images'].append({'file_name': index.replace("\\", "/"),
                              'id': k,
                              'width': W,
                              'height': H})
    with open(txtFile, 'r') as fr:
        labelList = fr.readlines()
    for label in labelList:
        label = label.strip().split()
        if len(label) < 5:
            continue
        x = float(label[1])
        y = float(label[2])
        w = float(label[3])
        h = float(label[4])
        # convert x,y,w,h to x1,y1,x2,y2
        x1 = int((x - w / 2) * W)
        y1 = int((y - h / 2) * H)
        x2 = int((x + w / 2) * W)
        y2 = int((y + h / 2) * H)
        # Class ids start from 1 to match the categories above
        cls_id = int(label[0]) + 1
        width = max(0, x2 - x1)
        height = max(0, y2 - y1)
        dataset['annotations'].append({
            'area': width * height,
            'bbox': [x1, y1, width, height],
            'category_id': cls_id,
            'id': ann_id,
            'image_id': k,
            'iscrowd': 0,
            # mask: the four corners of the rectangle, clockwise from the top-left corner
            'segmentation': [[x1, y1, x2, y1, x2, y2, x1, y2]]
        })
        ann_id += 1

# Folder in which the results are saved
folder = os.path.join(root_path, 'annotations')
os.makedirs(folder, exist_ok=True)
for phase, dataset in (('train', train_dataset), ('val', val_dataset), ('test', test_dataset)):
    json_name = os.path.join(folder, '{}.json'.format(phase))
    with open(json_name, 'w', encoding="utf-8") as f:
        json.dump(dataset, f, ensure_ascii=False, indent=1)
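How the two split points distribute the images can be seen from a short sketch. The helper assign_split is hypothetical and only restates the comparison used in the script: the position of an image (counted from 1) is compared with the rounded split points.

```python
def assign_split(k, total, val_point=4 / 5, test_point=9 / 10):
    """Return the split of the k-th image (k counted from 0) out of total images."""
    position = k + 1
    if position > round(total * test_point):
        return "test"
    if position > round(total * val_point):
        return "val"
    return "train"


splits = [assign_split(k, 10) for k in range(10)]
print(splits)
```

With 10 images this gives 8 for training, 1 for validation and 1 for testing, that is the 8:1:1 ratio. Since the images are taken in listing order, shuffling the list beforehand may be advisable when neighbouring files are similar.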

6. Face dataset to YOLO

import os
import cv2


def convert(size, box):
    # box = (xmin, ymin, xmax, ymax), given as strings
    x_center = (float(box[2]) + float(box[0])) / 2.0
    y_center = (float(box[3]) + float(box[1])) / 2.0
    # Normalize the center
    x = x_center / size[0]
    y = y_center / size[1]
    # Compute width and height and normalize them
    w = (float(box[2]) - float(box[0])) / size[0]
    h = (float(box[3]) - float(box[1])) / size[1]
    return (x, y, w, h)


def makexml(picPath, facePath, txtPath):  # image folder, source label folder, YOLO output folder
    """Convert the face label files to YOLO-format txt label files."""
    files = os.listdir(facePath)
    for name in files:
        with open(facePath + name) as txtFile:
            txtList = txtFile.readlines()
        print("name", name)
        img = cv2.imread(picPath + name[0:-4] + ".png")
        Pheight, Pwidth, Pdepth = img.shape
        yolo_txt_path = os.path.join(txtPath, os.path.splitext(name)[0] + ".txt")
        with open(yolo_txt_path, 'w') as f:
            for line in txtList:
                box = line.strip().split(" ")
                # Lines with fewer than four values carry no box and are skipped
                if len(box) < 4:
                    continue
                boxex = convert((Pwidth, Pheight), box)
                # Standard YOLO line: class x_center y_center width height (single class 0)
                f.write("0" + " " + " ".join([str(s) for s in boxex]) + '\n')


if __name__ == "__main__":
    picPath = "model/datasets/DarkFace_Train_2021/image/"
    facePath = "model/datasets/DarkFace_Train_2021/label/"
    txtPath = "model/datasets/DarkFace_Train_2021/labels/"
    makexml(picPath, facePath, txtPath)

7. LEVIR dataset to YOLO

import os
import cv2


def convert(size, box):
    # box = (class id, xmin, ymin, xmax, ymax)
    x_center = (int(box[3]) + int(box[1])) / 2.0
    y_center = (int(box[4]) + int(box[2])) / 2.0
    # Normalize the center
    x = x_center / size[0]
    y = y_center / size[1]
    # Compute width and height and normalize them
    w = (int(box[3]) - int(box[1])) / size[0]
    h = (int(box[4]) - int(box[2])) / size[1]
    return (int(box[0]), x, y, w, h)


def makexml(picPath, txtPath, yolo_paths):  # image folder, source txt folder, YOLO output folder
    """Convert the dataset's txt annotation files to YOLO-format txt label files."""
    files = os.listdir(txtPath)
    for name in files:
        yolo_txt_path = os.path.join(yolo_paths, os.path.splitext(name)[0] + ".txt")
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        Pheight, Pwidth, _ = img.shape
        with open(yolo_txt_path, 'w') as f:
            for line in txtList:
                oneline = line.strip().split(" ")
                if len(oneline) < 5:
                    continue
                # Negative top-left coordinates lie outside the image; clamp them to 0
                oneline[1] = str(max(0, int(oneline[1])))
                oneline[2] = str(max(0, int(oneline[2])))
                box = convert((Pwidth, Pheight), oneline)
                f.write(" ".join(str(v) for v in box) + '\n')


if __name__ == "__main__":
    picPath = "./out/"  # image folder; the trailing / is required
    txtPath = "./labels/"  # source txt folder; the trailing / is required
    yolo = "./xml/"  # folder for the YOLO txt files; the trailing / is required
    makexml(picPath, txtPath, yolo)

8. NWPU VHR-10 dataset

import os
import cv2


def convert(size, box):
    """Convert (class, xmin, ymin, xmax, ymax) to a normalized YOLO tuple."""
    x_center = (int(box[3]) + int(box[1])) / 2.0
    y_center = (int(box[4]) + int(box[2])) / 2.0
    # Normalize the center by the image width and height
    x = x_center / int(size[0])
    y = y_center / int(size[1])
    # Compute the width and height, then normalize them
    w = (int(box[3]) - int(box[1])) / size[0]
    h = (int(box[4]) - int(box[2])) / size[1]
    return (int(box[0]), x, y, w, h)


def makexml(picPath, txtPath, yolo_paths):
    """Convert NWPU VHR-10 txt annotations into YOLO-format txt files.

    picPath: image folder; txtPath: source txt folder; yolo_paths: output folder.
    """
    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        print(name)
        yolo_txt_path = os.path.join(yolo_paths, name.split(".")[0] + ".txt")
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        # The image size is needed for normalization
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        Pheight, Pwidth, _ = img.shape
        with open(yolo_txt_path, 'w') as f:
            for j in txtList:
                oneline = j.strip().split(",")
                # Skip blank or malformed lines
                if len(oneline) < 5:
                    continue
                # Strip the parentheses around each coordinate and put the class first
                oneline = (int(oneline[4]), int(oneline[0][1:]), int(oneline[1][:-1]),
                           int(oneline[2][1:]), int(oneline[3][:-1]))
                box = convert((Pwidth, Pheight), oneline)
                f.write(" ".join(str(v) for v in box) + '\n')


if __name__ == "__main__":
    picPath = "./image/"  # image folder; the trailing / is required
    txtPath = "./txt/"  # source txt folder; the trailing / is required
    yolo = "./xml/"  # output folder for the YOLO txt files; the trailing / is required
    makexml(picPath, txtPath, yolo)
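The slicing in the script implies that each annotation line looks like "(x1,y1),(x2,y2),class". That layout is inferred from the code rather than stated in the post, so treat it as an assumption. Under that assumption, a regular expression is a slightly more forgiving way to pull out the five integers, as in this sketch; parse_nwpu_line is a hypothetical helper name.

```python
import re


def parse_nwpu_line(line):
    """Parse one line assumed to look like '(x1,y1),(x2,y2),class'.

    Returns (class_id, xmin, ymin, xmax, ymax) as ints, or None when the
    line does not contain exactly five integers (for example a blank line).
    """
    numbers = re.findall(r"-?\d+", line)
    if len(numbers) != 5:
        return None
    x1, y1, x2, y2, class_id = (int(n) for n in numbers)
    return class_id, x1, y1, x2, y2


if __name__ == "__main__":
    print(parse_nwpu_line("(563,478),(630,573),1"))
    # Extra spaces around the numbers do not matter
    print(parse_nwpu_line("( 563, 478),( 630, 573), 1 "))
```

The returned tuple has the same field order that convert expects, so it could be passed to it directly.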

9. UCAS_AOD

import os
import cv2
import math


def convert(size, box):
    """Convert (class, x, y, w, h), with (x, y) the top-left corner, to a normalized YOLO tuple."""
    x_center = box[1] + box[3] / 2.0
    y_center = box[2] + box[4] / 2.0
    # Normalize the center by the image width and height
    x = x_center / int(size[0])
    y = y_center / int(size[1])
    # Normalize the width and height
    w = box[3] / size[0]
    h = box[4] / size[1]
    return (int(box[0]), x, y, w, h)


def fun(str_num):
    """Parse a number written in scientific notation, e.g. '1.5e+02'."""
    before_e = float(str_num.split('e')[0])
    sign = str_num.split('e')[1][:1]
    after_e = int(str_num.split('e')[1][1:])
    if sign == '+':
        float_num = before_e * math.pow(10, after_e)
    elif sign == '-':
        float_num = before_e * math.pow(10, -after_e)
    else:
        float_num = None
        print('error: unknown sign')
    return float_num


def to_number(text):
    """Return text as an int, falling back to scientific-notation parsing."""
    try:
        return int(text)
    except ValueError:
        return fun(text)


def makexml(picPath, txtPath, yolo_paths):
    """Convert UCAS_AOD txt annotations into YOLO-format txt files.

    picPath: image folder; txtPath: source txt folder; yolo_paths: output folder.
    """
    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        print(name)
        yolo_txt_path = os.path.join(yolo_paths, name.split(".")[0] + ".txt")
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        # The image size is needed for normalization
        img = cv2.imread(picPath + name[0:-4] + ".png")
        Pheight, Pwidth, _ = img.shape
        with open(yolo_txt_path, 'w') as f:
            for j in txtList:
                oneline = j.strip().split("\t")
                # Columns 9-12 hold the top-left x, top-left y, width and height
                a = to_number(oneline[9])
                b = to_number(oneline[10])
                c = to_number(oneline[11])
                d = to_number(oneline[12])
                # The class id is fixed to 1 here
                oneline = (1, a, b, c, d)
                box = convert((Pwidth, Pheight), oneline)
                f.write(" ".join(str(v) for v in box) + '\n')


if __name__ == "__main__":
    picPath = "./CAR/"  # image folder; the trailing / is required
    txtPath = "./labels/"  # source txt folder; the trailing / is required
    yolo = "./xml/"  # output folder for the YOLO txt files; the trailing / is required
    makexml(picPath, txtPath, yolo)
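Python's built-in float() already accepts scientific notation such as "1.2e+02", so the manual parsing in fun can be replaced by a single call. The sketch below shows that, together with the top-left-plus-size conversion used for UCAS_AOD. The names parse_number and xywh_to_yolo are hypothetical and introduced only for this example.

```python
def parse_number(text):
    """Parse plain or scientific-notation text ('35', '1.2e+02') as a float."""
    return float(text)


def xywh_to_yolo(img_w, img_h, x, y, w, h):
    """Convert a box given as top-left (x, y) plus size (w, h) in pixels
    to normalized YOLO (x_center, y_center, width, height)."""
    x_center = (x + w / 2.0) / img_w
    y_center = (y + h / 2.0) / img_h
    return x_center, y_center, w / img_w, h / img_h


if __name__ == "__main__":
    fields = ["1.0000000e+02", "5.0000000e+01", "40", "20"]
    x, y, w, h = (parse_number(v) for v in fields)
    # A 200x100 image is assumed purely for illustration
    print(xywh_to_yolo(200, 100, x, y, w, h))
```

Unlike fun, float() also handles strings without an exponent, so no separate int() attempt is needed.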
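Before running the code, place a classes.txt file in the dataset root directory. After that, you only need to specify --root_path to run the conversion.
Thanks to the blogger for sharing the material above: https://blog.csdn.net/qq_40502460/article/details/116564254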
