Building a YOLOX Plate Detector – Setup, Fine-Tuning, Metrics, Dashcam inference

Notes and Code

Install YOLOX and needed deps

Download the YOLOX GitHub repo: https://github.com/Megvii-BaseDetection/YOLOX

Make sure you are using Python 3.11 or older. I had some issues with Python 3.12, and switching to 3.11 made everything much easier.

In requirements.txt, set the onnx-simplifier version to 0.3.10 and the pycocotools version to at least 2.0.8. The file should then look like this:

numpy
torch>=1.7
opencv_python
loguru
tqdm
torchvision
thop
ninja
tabulate
psutil
tensorboard

# verified versions
# pycocotools corresponds to https://github.com/ppwwyyxx/cocoapi
pycocotools>=2.0.8
onnx>=1.13.0
onnx-simplifier==0.3.10

Then install the dependencies, PyTorch and YOLOX itself:

pip install -r requirements.txt
pip install torch==1.13.1
pip install -v -e .

You should now have everything installed to run the YOLOX training script.

Car plates dataset

Download the Roboflow car plates dataset in COCO format. Once it is unzipped into the YOLOX/datasets/ directory, rename the train, valid and test directories to train2017, val2017 and test2017.

Each directory contains an _annotations.coco.json file. Copy these files from train2017, val2017 and test2017 into a separate annotations directory, renaming them as follows:

# in the YOLOX/datasets directory
train2017/_annotations.coco.json -> annotations/train_annotations.coco.json
val2017/_annotations.coco.json -> annotations/val_annotations.coco.json
test2017/_annotations.coco.json -> annotations/test_annotations.coco.json
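
The renaming and copying steps above can also be scripted. This is a minimal sketch, assuming the dataset was unzipped with Roboflow's train, valid and test folder names; the helper name prepare_coco_layout is my own and is not part of YOLOX.

```python
import shutil
from pathlib import Path

# Roboflow folder name -> (YOLOX folder name, file name inside annotations/)
SPLITS = {
    "train": ("train2017", "train_annotations.coco.json"),
    "valid": ("val2017", "val_annotations.coco.json"),
    "test": ("test2017", "test_annotations.coco.json"),
}


def prepare_coco_layout(datasets_dir):
    """Rename the split folders and copy each annotation file into annotations/."""
    root = Path(datasets_dir)
    ann_dir = root / "annotations"
    ann_dir.mkdir(exist_ok=True)
    for old_name, (new_name, ann_name) in SPLITS.items():
        old_dir, new_dir = root / old_name, root / new_name
        # Only rename when the source exists and the target is not there yet,
        # so running the script twice is harmless.
        if old_dir.is_dir() and not new_dir.exists():
            old_dir.rename(new_dir)
        src = new_dir / "_annotations.coco.json"
        if src.is_file():
            shutil.copy(src, ann_dir / ann_name)
    return ann_dir


# Usage, from the YOLOX root:
# prepare_coco_layout("datasets")
```

The original _annotations.coco.json files are left in place inside each split folder; only copies go into annotations/.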

These annotation files contain two categories. The one we want is the actual category, named "License_Plate", with ID 1. The other is its parent category, with ID 0, which we will remove. We also rename the remaining category to "plate" and set its supercategory to "none".

{
  "info": {...},
  "licenses": [...],
  "categories": [
    {
      "id": 0,
      "name": "license-plates",
      "supercategory": "none"
    },
    {
      "id": 1,
      "name": "License_Plate",
      "supercategory": "license-plates"
    }
  ],
  ...
}

Becomes

{
  "info": {...},
  "licenses": [...],
  "categories": [
    {
      "id": 1,
      "name": "plate",
      "supercategory": "none"
    }
  ],
  ...
}

Do this for all three annotation files.
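
Editing three JSON files by hand is error-prone, so here is a minimal sketch of the same change in Python. The function names are hypothetical. It also drops any annotation that still points at a removed category; I am assuming the parent category has no annotations of its own, so this should normally remove nothing.

```python
import json
from pathlib import Path


def keep_single_category(coco, keep_id=1, new_name="plate"):
    """Return a copy of a COCO dict that keeps only category `keep_id`, renamed."""
    kept = [c for c in coco["categories"] if c["id"] == keep_id]
    if len(kept) != 1:
        raise ValueError(f"expected exactly one category with id {keep_id}")
    out = dict(coco)  # shallow copy, the input dict is not modified
    out["categories"] = [
        {"id": keep_id, "name": new_name, "supercategory": "none"}
    ]
    # Assumption: only annotations of the kept category are wanted.
    out["annotations"] = [
        a for a in coco.get("annotations", []) if a["category_id"] == keep_id
    ]
    return out


def rewrite_file(path):
    """Apply keep_single_category to one annotation file, in place."""
    path = Path(path)
    coco = json.loads(path.read_text())
    path.write_text(json.dumps(keep_single_category(coco)))


# Usage, from the YOLOX root:
# for split in ("train", "val", "test"):
#     rewrite_file(f"datasets/annotations/{split}_annotations.coco.json")
```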

Fine-tuning yolox-s model for car plate detection

Download one of the pretrained YOLOX models. I'll use the small model, yolox_s.pth, which has ~9M parameters. This is the model we are going to fine-tune.

We also need an experiment file, a Python file that defines the details of the training run:

# plates.py
import os

from yolox.exp import Exp as MyExp


class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]

        # Define your own dataset path
        self.data_dir = "datasets"
        self.train_ann = "train_annotations.coco.json"
        self.val_ann = "val_annotations.coco.json"
        self.test_ann = "test_annotations.coco.json"

        self.num_classes = 1

        self.max_epoch = 30
        self.data_num_workers = 8
        self.eval_interval = 1

  • depth and width: these are really important because they define the network size. depth=0.33 and width=0.50 are the values suggested for the small model; for X (the extra-large model, 99M params) the suggested values are depth=1.33 and width=1.25. You can find the other YOLOX model sizes under exps/default.
  • max_epoch: we run the training for 30 epochs.
  • num_classes: the number of output object classes. In our case 1, just the car license plate.
  • train_ann, val_ann, test_ann: the annotation file names in datasets/annotations.
  • data_dir: datasets by default, the directory that holds the annotations, train2017, val2017 and test2017 folders.
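
To make the depth and width multipliers more concrete, here is a small sketch of how they translate into backbone sizes. It mirrors how I understand YOLOX's CSPDarknet backbone derives its sizes; treat the constants 3 and 64 as an assumption to check against yolox/models/darknet.py. The function name backbone_scale is hypothetical.

```python
def backbone_scale(depth, width):
    """Map the two multipliers to a base block count and a base channel count."""
    # Assumed constants: 3 blocks and 64 channels at depth = width = 1.0.
    base_depth = max(round(depth * 3), 1)
    base_channels = int(width * 64)
    return base_depth, base_channels


for name, depth, width in [("yolox-s", 0.33, 0.50), ("yolox-x", 1.33, 1.25)]:
    print(name, backbone_scale(depth, width))
# yolox-s (1, 32)
# yolox-x (4, 80)
```

This is why the checkpoint and the experiment file must agree: weights trained with one depth and width have different layer shapes from a network built with another.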
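
Start the training with:

python tools/train.py -f plates.py -d 1 -b 64 -o -c yolox_s.pth --cache

  • -d 1: the number of devices, 1 if you have a single GPU.
  • -b 64: batch size, 64 images per batch.
  • -o: occupy (pre-allocate) GPU memory before training starts.
  • -c yolox_s.pth: the pretrained checkpoint to start fine-tuning from.
  • --cache: use it only if you have enough RAM to hold the whole dataset in memory. It makes the training faster.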
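
Before launching a long run, it can be worth checking that the layout the experiment file expects is actually on disk. This is a minimal sketch, assuming the datasets/annotations, train2017 and val2017 layout described earlier; check_split is a hypothetical helper, not a YOLOX function.

```python
import json
from pathlib import Path


def check_split(data_dir, ann_name, image_dir, num_classes=1):
    """Return a list of problems for one split (an empty list means OK)."""
    root = Path(data_dir)
    ann_path = root / "annotations" / ann_name
    if not ann_path.is_file():
        return [f"missing annotation file: {ann_path}"]
    coco = json.loads(ann_path.read_text())
    problems = []
    # The category list should match num_classes in the experiment file.
    if len(coco.get("categories", [])) != num_classes:
        problems.append(f"{ann_name}: expected {num_classes} categories")
    # Every image listed in the annotations should exist in the split folder.
    missing = [
        img["file_name"]
        for img in coco.get("images", [])
        if not (root / image_dir / img["file_name"]).is_file()
    ]
    if missing:
        problems.append(
            f"{ann_name}: {len(missing)} images not found in {image_dir}"
        )
    return problems


# Usage, from the YOLOX root:
# for ann, folder in [("train_annotations.coco.json", "train2017"),
#                     ("val_annotations.coco.json", "val2017")]:
#     print(ann, check_split("datasets", ann, folder) or "OK")
```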

Once training completes, you'll find the model output in YOLOX_outputs/plates/. There are several .pth files there (checkpoints, the last epoch, etc.). There is also best_ckpt.pth, the checkpoint that achieved the best accuracy (I think it's selected by mAP@50:95).

Export to ONNX

We are going to use tools/export_onnx.py but to make it work we need to make two changes.

  1. Pass weights_only=False to the torch.load() call.
    torch.load(...) -> torch.load(..., weights_only=False)
  2. Rename _export to export.
    torch.onnx._export(...) -> torch.onnx.export(...)
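
If you prefer not to edit the file by hand, the two changes can be sketched as a text patch. This is a rough helper of my own (patch_source is a hypothetical name), and it assumes each torch.load(...) call sits on one line with no nested parentheses in its arguments. Review the result before overwriting anything.

```python
import re

# Matches a torch.load(...) call whose arguments contain no nested parentheses.
_LOAD_CALL = re.compile(r"torch\.load\(([^()]*)\)")


def patch_source(text):
    """Apply the two edits described above to the text of a tool script."""

    def add_flag(match):
        args = match.group(1)
        if "weights_only" in args:
            return match.group(0)  # already patched, leave untouched
        args = args.rstrip().rstrip(",")  # tolerate a trailing comma
        return f"torch.load({args}, weights_only=False)"

    text = _LOAD_CALL.sub(add_flag, text)
    return text.replace("torch.onnx._export(", "torch.onnx.export(")


# Usage, from the YOLOX root:
# from pathlib import Path
# script = Path("tools/export_onnx.py")
# script.write_text(patch_source(script.read_text()))
```

Running the helper twice leaves the file unchanged, since calls that already pass weights_only are skipped.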

Now the script should work with the dependencies we’ve installed before.

python tools/export_onnx.py -c YOLOX_outputs/plates/best_ckpt.pth --batch-size 1 -f ./plates.py -o 12

If it doesn’t find plates.py, copy plates.py to exps/default and run

python tools/export_onnx.py -c YOLOX_outputs/plates/best_ckpt.pth --batch-size 1 -f plates  -o 12

If successful, you'll find a yolox.onnx file in the project folder.

Model evaluation

You can use the tools/demo.py script to run the fine-tuned model on a video and detect plates. It outputs the same video with bounding boxes and probabilities drawn over the detected plates.

Again, to make tools/demo.py work you'll probably need to edit it and pass weights_only=False where torch.load() is called.

python tools/demo.py video \
  -f plates.py \
  -c YOLOX_outputs/plates/best_ckpt.pth \
  --path YOUR_VIDEO.mp4 \
  --conf 0.25 \
  --nms 0.45 \
  --tsize 640 \
  --device gpu \
  --save_result

You'll find the result under YOLOX_outputs/plates/vis_res.