Chinese Scene Text Recognition - Converting Between txt and JSON Files

# Chinese Scene Text Recognition - Converting Between txt and JSON Files

The label files provided for this competition are in JSON format. Two of them are shown below for reference:

- The official `img_1.json`

```json=
{"version": "4.5.7", "flags": {}, "shapes": [{"label": "髮型工作室", "points": [[805, 617], [1052, 613], [1059, 662], [809, 665]], "group_id": 0, "shape_type": "polygon", "flags": {}}, {"label": "漫", "points": [[694, 557], [773, 558], [797, 667], [692, 666]], "group_id": 4, "shape_type": "polygon", "flags": {}}, {"label": "", "points": [[701, 1018], [708, 1016], [708, 1024], [701, 1024]], "group_id": 5, "shape_type": "polygon", "flags": {}}, {"label": "SPA", "points": [[694, 1010], [708, 1008], [711, 1015], [695, 1019]], "group_id": 2, "shape_type": "polygon", "flags": {}}, {"label": "黛安娜", "points": [[694, 956], [707, 956], [708, 1008], [692, 1011]], "group_id": 0, "shape_type": "polygon", "flags": {}}, {"label": "髮", "points": [[806, 620], [841, 620], [851, 666], [810, 664]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "作", "points": [[957, 617], [1001, 615], [1004, 664], [961, 662]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "工", "points": [[908, 625], [943, 619], [950, 655], [914, 658]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "室", "points": [[1012, 616], [1052, 616], [1058, 660], [1012, 663]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "型", "points": [[852, 623], [892, 618], [895, 663], [860, 664]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "娜", "points": [[693, 994], [706, 992], [709, 1007], [695, 1011]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "安", "points": [[694, 976], [707, 974], [708, 991], [694, 993]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "黛", "points": [[694, 957], [707, 957], [708, 973], [695, 974]], "group_id": 1, "shape_type": "polygon", "flags": {}}], "imagePath": "img_1.jpg", "imageData": null, "imageHeight": "1024", "imageWidth": "1365"}
```

![](https://i.imgur.com/aCL0VPz.jpg)

- The official `img_5.json`
```json=
{"version": "4.5.7", "flags": {}, "shapes": [{"label": "###", "points": [[1034, 1006], [1151, 955], [1171, 1024], [1043, 1024]], "group_id": 255, "shape_type": "polygon", "flags": {}}, {"label": "", "points": [[915, 946], [956, 944], [956, 959], [916, 959]], "group_id": 5, "shape_type": "polygon", "flags": {}}, {"label": "", "points": [[911, 50], [962, 2], [1004, 234], [940, 228]], "group_id": 5, "shape_type": "polygon", "flags": {}}, {"label": "永珍銀樓", "points": [[88, 280], [1262, 350], [1327, 573], [44, 527]], "group_id": 0, "shape_type": "polygon", "flags": {}}, {"label": "", "points": [[920, 962], [955, 962], [956, 974], [920, 976]], "group_id": 5, "shape_type": "polygon", "flags": {}}, {"label": "永", "points": [[81, 281], [373, 298], [358, 541], [42, 531]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "珍", "points": [[411, 300], [704, 318], [701, 552], [396, 541]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "銀", "points": [[744, 328], [1016, 339], [1044, 563], [744, 552]], "group_id": 1, "shape_type": "polygon", "flags": {}}, {"label": "樓", "points": [[1042, 336], [1262, 351], [1331, 574], [1081, 566]], "group_id": 1, "shape_type": "polygon", "flags": {}}], "imagePath": "img_5.jpg", "imageData": null, "imageHeight": "1024", "imageWidth": "1365"}
```

![](https://i.imgur.com/hhcOjuv.jpg)

---

However, we train on these images with YOLOv4, and YOLO expects its labels as txt files, as shown below:

:::info
The first number on each line of the txt file corresponds to "group_id" in the JSON file. For example, if "group_id" = 0 in the JSON file, the first number in the txt file is 0, and so on. The four numbers that follow are the box's center x, center y, width, and height, each divided by the image width or height. Because of YOLO's input format, "group_id" = 255 in the JSON file is written as 6 in the txt file rather than 255, which is why entries that were originally 255 appear as 6 in the files below.
:::

- 1.txt

```
0 0.6827838827838828 0.6240234375 0.18608058608058609 0.05078125
4 0.5454212454212454 0.59765625 0.07692307692307693 0.107421875
5 0.5161172161172162 0.99609375 0.005128205128205128 0.0078125
2 0.5146520146520146 0.98974609375 0.012454212454212455 0.0107421875
0 0.5128205128205128 0.96044921875 0.011721611721611722 0.0537109375
1 0.6069597069597069 0.6279296875 0.03296703296703297 0.044921875
1 0.7183150183150183 0.62451171875 0.034432234432234435 0.0478515625
1 0.6805860805860806 0.62353515625 0.03076923076923077 0.0380859375
1 0.7582417582417582 0.62451171875 0.0336996336996337 0.0458984375
1 0.6399267399267399 0.6259765625 0.0315018315018315 0.044921875
1 0.5135531135531135 0.97802734375 0.011721611721611722 0.0185546875
1 0.5135531135531135 0.96044921875 0.010256410256410256 0.0185546875
1 0.5135531135531135 0.94287109375 0.010256410256410256 0.0166015625
```

- 5.txt

```
6 0.8076923076923077 0.96630859375 0.10036630036630037 0.0673828125
5 0.6853479853479854 0.92919921875 0.030036630036630037 0.0146484375
5 0.7014652014652014 0.115234375 0.06813186813186813 0.2265625
0 0.5021978021978022 0.41650390625 0.9399267399267399 0.2861328125
5 0.6871794871794872 0.9462890625 0.026373626373626374 0.013671875
1 0.152014652014652 0.4013671875 0.2424908424908425 0.25390625
1 0.40293040293040294 0.416015625 0.22564102564102564 0.24609375
1 0.654945054945055 0.43505859375 0.21978021978021978 0.2294921875
1 0.8692307692307693 0.4443359375 0.21172161172161172 0.232421875
```

---

So by now it should be clear what we need to do. That's right: convert the JSON files to txt so they can be fed to YOLO for training. BUT!!!
I have already finished this part, and training is under way. So what is this post actually for? The competition requires the final submission to follow the rules shown in the figure below:

![](https://i.imgur.com/9Lo19gw.png)

As for the upload format, the coordinates we have are expressed as `(x0, y0, x2, y2)`, while the submission must be `(x0, y0, x1, y1, x2, y2, x3, y3)`, with the points listed in clockwise order.

![](https://i.imgur.com/tQCeruq.png)

So we need to **convert the coordinates output by YOLO back into the coordinates the competition requires**.

For example, if YOLO outputs `[50., 281., 359., 539.]`, it has to be converted to the coordinates the task asks for, namely `[50, 281, 359, 281, 359, 539, 50, 539]`.

---

Below is the Python script I used to convert JSON to txt:

```python=
import json
import os

import numpy as np

json_root = '../data/train/json/'
txt_root = '../data/train/txt/'
files = os.listdir(json_root)

obj_names = '../data/train/obj.names'
f_names = open(obj_names, 'a')
names = []
print(len(files))

for i in range(len(files)):
    name = files[i].split('.')[0]
    json_name = os.path.join(json_root, files[i])
    txt_name = os.path.join(txt_root, name + '.txt')

    f_json = open(json_name, 'r', encoding='utf-8')
    f_txt = open(txt_name, 'w')

    data = json.load(f_json)
    shapes = data['shapes']
    # imageWidth / imageHeight are stored as strings in the JSON
    ww = float(data['imageWidth'])
    hh = float(data['imageHeight'])

    for j in range(len(shapes)):
        # Map 'group_id' = 255 in the JSON to 6
        if shapes[j]['group_id'] == 255:
            shapes[j]['group_id'] = 6

        # Build obj.names: record each group_id the first time it appears.
        # No `continue` here, otherwise every later box with an
        # already-seen group_id would be skipped in the txt output.
        if shapes[j]['group_id'] not in names:
            f_names.write(str(shapes[j]['group_id']) + '\n')
            names.append(shapes[j]['group_id'])

        # Convert the JSON polygon to a YOLO txt line:
        # take the axis-aligned bounding box of the polygon
        points = shapes[j]['points']
        points = np.array(points)
        xs = points[:, 0].astype('float')
        ys = points[:, 1].astype('float')

        x_max = xs.max()
        x_min = xs.min()
        y_max = ys.max()
        y_min = ys.min()

        x_center = (x_min + x_max) / 2
        y_center = (y_min + y_max) / 2
        w = x_max - x_min
        h = y_max - y_min

        # Write: class, then center/size normalized by the image size
        f_txt.write(str(shapes[j]['group_id']) + ' ')
        f_txt.write(str(x_center / ww) + ' ')
        f_txt.write(str(y_center / hh) + ' ')
        f_txt.write(str(w / ww) + ' ')
        f_txt.write(str(h / hh) + '\n')

    f_txt.close()
    f_json.close()

f_names.close()
```

###### tags: `T-Brain`
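---

The reverse conversion described earlier, from a two-corner box to the eight-value clockwise format, could be sketched roughly as follows. This is only a minimal sketch, not the code actually used for the submission. The function names `corners_to_polygon` and `yolo_to_corners` are hypothetical. It assumes the detector output is already in pixel `(x0, y0, x2, y2)` form, as in the example above. If the boxes are instead in the normalized txt form (center x, center y, width, height), `yolo_to_corners` converts them first.

```python
from typing import List, Sequence, Tuple


def yolo_to_corners(x_center: float, y_center: float, w: float, h: float,
                    img_w: float, img_h: float) -> Tuple[float, float, float, float]:
    """Normalized YOLO box -> pixel (x0, y0, x2, y2). Hypothetical helper."""
    x0 = (x_center - w / 2) * img_w
    y0 = (y_center - h / 2) * img_h
    x2 = (x_center + w / 2) * img_w
    y2 = (y_center + h / 2) * img_h
    return x0, y0, x2, y2


def corners_to_polygon(box: Sequence[float]) -> List[int]:
    """Pixel (x0, y0, x2, y2) -> [x0, y0, x1, y1, x2, y2, x3, y3].

    Points go clockwise in image coordinates:
    top-left, top-right, bottom-right, bottom-left.
    """
    x0, y0, x2, y2 = (int(round(v)) for v in box)
    return [x0, y0, x2, y0, x2, y2, x0, y2]


# The example from this post
print(corners_to_polygon([50., 281., 359., 539.]))
# [50, 281, 359, 281, 359, 539, 50, 539]
```

Rounding to integers is an assumption based on the example above. Check the competition rules for the exact number format they expect.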