Open source recommendation system for training object detection models


ABOUT

Our system helps you choose the most suitable existing object recognition models based on your preferences and data. Beyond selecting the model architecture, the system helps you configure the environment and start training. The recommendation system consists of several components that interact to generate recommendations for machine learning pipelines.

External parameters (received from users and third-party resources):

  • Dataset: the input data (video frames) and associated metadata (e.g. image size, quality, number of objects).
  • Model: the framework can train the most popular object recognition models, including setting up the environment and choosing the architecture of a specific model. It covers two-stage detectors such as Faster R-CNN and Mask R-CNN as well as one-stage detectors such as SSD and YOLO (families v5, v7 and v8).

[Figure: system pipeline. The properties of the Dataset (input data) and the Model (YOLOv5/v7/v8, Faster R-CNN, SSD) are used to generate recommendations: RecommendationEngine(Dataset, Model) -> recommended models are trained: Training(Recommendation) -> the quality of the model is assessed: Evaluation(Model).]

Internal components

RecommendationEngine: generates recommendations based on user data and dataset characteristics

[Figure: recommendation system algorithm. Input parameters (image data and user parameters) -> data preprocessing -> finding the optimal solutions (knowledge base, array of parameters, KNN algorithm) -> forming a solution -> output parameters (best model and recommendation lists).]

Example of usage

if: balance of speed and accuracy; GRU interface; image size 1920x1080; unbalanced dataset; 2200 images
then (recommendation process): Yolov5x, Yolov7x, Yolov5l

The recommendation algorithm is based on production rules. The primary set of rules (the knowledge base) is built from an analysis of scientific sources and standard datasets, as well as empirical processing of datasets from specific industries. The main criteria chosen for drawing up the rules are:

1. Dimension of the model
2. The value of metrics (mAP, Recall, Accuracy) for selected datasets
3. The speed of the model on GPU and CPU
4. Supported image format and dimension
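
The rule lookup described above can be sketched as a nearest-neighbour search over a small knowledge base. This is only an illustration under stated assumptions: the feature layout (speed preference, accuracy preference, GPU flag, image count), the rule values, and the names KNOWLEDGE_BASE, FEATURE_MAX and recommend are hypothetical and are not taken from the ODRS code.

```python
import math

# Hypothetical knowledge base: each rule pairs a feature vector
# (speed preference 1-5, accuracy preference 1-10, GPU flag 0/1, image count)
# with a ranked list of models. Values are illustrative, not ODRS's real rules.
KNOWLEDGE_BASE = [
    ((3, 6, 1, 2200), ["yolov5x", "yolov7x", "yolov5l"]),
    ((5, 2, 0, 500), ["yolov5n", "yolov8n", "yolov7-tiny"]),
    ((1, 10, 1, 10000), ["yolov8x", "yolov7", "yolov5x"]),
]

# Upper bounds used to scale every feature into [0, 1] before comparing.
FEATURE_MAX = (5, 10, 1, 10000)


def normalize(features):
    """Scale raw features so that no single feature dominates the distance."""
    return [min(value / limit, 1.0) for value, limit in zip(features, FEATURE_MAX)]


def recommend(features, k=1):
    """Return the model lists of the k rules closest to the query features."""
    query = normalize(features)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda rule: math.dist(query, normalize(rule[0])),
    )
    return [models for _, models in ranked[:k]]


if __name__ == "__main__":
    # A balanced speed/accuracy request on GPU with about 2000 images.
    print(recommend((3, 5, 1, 2000)))
```

Scaling the features first matters here because the image count is several orders of magnitude larger than the preference scores and would otherwise decide the result on its own.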

Training: training of models proposed by the system

[Figure: training workflow. Input parameters (selected ML model and user data) -> install requirements -> customization -> selection of training parameters -> training process -> output parameters (trained ML model and metrics).]

Evaluation: evaluation of the quality of training models

Installation

Download repository and install the necessary dependencies using the following commands:

git clone https://github.com/saaresearch/ODRS.git
cd ODRS/
pip install -r requirements.txt 

Dataset structure

To use the recommendation system or train the desired detector, put your dataset in YOLO format in the user_datasets/yolo directory. The dataset can have one of the following structures:

user_datasets
|_ _yolo
	|_ _ <folder_name_your_dataset>
		|_ _train
			|_ _images
				|_ <name_1>.jpg
				|_ ...
				|_ <name_N>.jpg
			|_ _labels
				|_ <name_1>.txt
				|_ ...
				|_ <name_N>.txt
		|_ _valid
			|_ _images
				|_ <name_1>.jpg
				|_ ...
				|_ <name_N>.jpg
			|_ _labels
				|_ <name_1>.txt
				|_ ...
				|_ <name_N>.txt
		|_ _test
			|_ _images
				|_ <name_1>.jpg
				|_ ...
				|_ <name_N>.jpg
			|_ _labels
				|_ <name_1>.txt
				|_ ...
				|_ <name_N>.txt
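
Before training, it can help to confirm that every image in this layout has a label file with the same name. The sketch below is a hypothetical helper (find_unpaired_images is not part of ODRS) and assumes .jpg images and the train/valid/test folder names shown above.

```python
from pathlib import Path

# Split folders expected in the layout shown above.
SPLITS = ("train", "valid", "test")


def find_unpaired_images(dataset_dir):
    """Return images that have no same-named .txt file in the sibling labels folder."""
    dataset_dir = Path(dataset_dir)
    missing = []
    for split in SPLITS:
        images_dir = dataset_dir / split / "images"
        labels_dir = dataset_dir / split / "labels"
        if not images_dir.is_dir():
            continue  # a split folder may be absent; skip it
        for image in sorted(images_dir.glob("*.jpg")):
            if not (labels_dir / f"{image.stem}.txt").is_file():
                missing.append(image)
    return missing


if __name__ == "__main__":
    # Replace the placeholder with the name of your dataset folder.
    for path in find_unpaired_images("user_datasets/yolo/<folder_name_your_dataset>"):
        print("no label file for", path)
```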

Alternatively, you can use the following flat structure, in which case your dataset will be automatically split into samples:

user_datasets
|_ _yolo
	|_ _ <folder_name_your_dataset>
		|_ <name_1>.jpg
		|_ ...
		|_ <name_N>.jpg
		|_ ...
		|_ <name_1>.txt
		|_ ...
		|_ <name_N>.txt
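
A minimal sketch of how such an automatic split could work is shown below. It assumes the ratios come from the SPLIT_TRAIN_VALUE and SPLIT_VAL_VALUE settings in custom_config.yaml and that whatever remains goes to the test sample; the function name split_names and the fixed seed are hypothetical and do not describe the actual ODRS implementation.

```python
import random


def split_names(names, train_value=0.6, val_value=0.35, seed=0):
    """Shuffle sample names and cut them into train, valid and test lists."""
    names = sorted(names)                 # start from a deterministic order
    random.Random(seed).shuffle(names)    # reproducible shuffle
    train_end = round(len(names) * train_value)
    val_end = train_end + round(len(names) * val_value)
    # Whatever is left after train and valid becomes the test sample.
    return names[:train_end], names[train_end:val_end], names[val_end:]


if __name__ == "__main__":
    stems = [f"name_{i}" for i in range(1, 101)]
    train, valid, test = split_names(stems)
    print(len(train), len(valid), len(test))
```

Splitting by file stem rather than by full file name keeps each image together with its label file.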

Add a .txt file containing the names of all classes in your dataset to the root directory of the project.

Example classes.txt:

boat
car
dock
jetski
lift
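
Each line of a YOLO label file has the form "class_id x_center y_center width height", with the box values normalized to the image size. The sketch below shows how class ids could be mapped back to the names in classes.txt, assuming that a class id is the zero-based line index in that file (the usual convention, but not confirmed here for ODRS). Both function names are hypothetical.

```python
def load_class_names(text):
    """Split the contents of classes.txt into names, one per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def describe_label(line, class_names):
    """Turn one YOLO label line into (class_name, x_center, y_center, width, height)."""
    class_id, *box = line.split()
    # Assumption: class_id indexes into classes.txt starting from 0.
    return (class_names[int(class_id)], *map(float, box))


if __name__ == "__main__":
    names = load_class_names("boat\ncar\ndock\njetski\nlift\n")
    print(describe_label("1 0.5 0.5 0.2 0.1", names))
```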

ML Recommendation system

After you have placed your dataset in the user_datasets/yolo folder and created a .txt file in the root directory containing the names of all classes in your dataset, you can start working with the main functionality of the project.


To use the recommendation system, you need to configure ml_config.yaml. Go to the desired directory:

cd ODRS/ml_utils/config/

Open ml_config.yaml and set the necessary parameters and paths:

#dataset_path: path to data folder
#classes_path: path to classes.txt
#GPU: True/False
#speed: 1 - 5; choose 5 for maximum speed, 1 for the lowest speed
#accuracy: 1 - 10; choose 10 for maximum accuracy, 1 for the lowest accuracy


GPU: true
accuracy: 10
classes_path: classes.txt
dataset_path: /media/farm/ssd_1_tb_evo_sumsung/ODRS/user_datasets/yolo/plant
speed: 1 
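
A quick sanity check of these values before running the recommender might look like the sketch below. The ranges follow the comments in the config above; the function validate_ml_config is a hypothetical helper and is not part of ODRS.

```python
# Keys that appear in the ml_config.yaml example above.
REQUIRED_KEYS = ("GPU", "accuracy", "classes_path", "dataset_path", "speed")


def _is_int_in_range(value, low, high):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and low <= value <= high


def validate_ml_config(config):
    """Return a list of problems; an empty list means the config looks fine."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if "GPU" in config and not isinstance(config["GPU"], bool):
        problems.append("GPU must be true or false")
    if "speed" in config and not _is_int_in_range(config["speed"], 1, 5):
        problems.append("speed must be an integer from 1 to 5")
    if "accuracy" in config and not _is_int_in_range(config["accuracy"], 1, 10):
        problems.append("accuracy must be an integer from 1 to 10")
    return problems


if __name__ == "__main__":
    example = {
        "GPU": True,
        "accuracy": 10,
        "classes_path": "classes.txt",
        "dataset_path": "user_datasets/yolo/plant",
        "speed": 1,
    }
    print(validate_ml_config(example))
```

The mapping passed to the function can be obtained by loading the file with PyYAML's yaml.safe_load.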

Detectors Training

Run the ml_model_optimizer.py script:

cd ..
python ml_model_optimizer.py

If everything worked successfully, you will see output similar to the following:

Number of images: 3496
Width: 960
Height: 540
Gini Coefficient: 94.0
Number of classes: 28
Top models for training:
1) yolov7
2) yolov8x6
3) yolov7x
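
The Gini coefficient in this output summarizes how unevenly objects are distributed across classes. The sketch below computes the standard Gini coefficient over per-class object counts read from YOLO label lines. Whether ODRS uses exactly this formula is an assumption, and so is the idea that the reported 94.0 is this value expressed as a percentage. Both function names are hypothetical.

```python
from collections import Counter


def count_classes(label_lines):
    """Count objects per class id from YOLO label lines ("class_id x y w h")."""
    return Counter(int(line.split()[0]) for line in label_lines if line.strip())


def gini_coefficient(counts):
    """Gini coefficient of non-negative counts: 0 means perfectly balanced."""
    values = sorted(counts)
    total = sum(values)
    n = len(values)
    if n == 0 or total == 0:
        return 0.0
    # Weight each sorted value by its 1-based rank.
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    lines = ["0 0.5 0.5 0.1 0.1", "0 0.2 0.2 0.1 0.1", "2 0.7 0.7 0.3 0.3"]
    per_class = count_classes(lines)
    print(per_class, round(100 * gini_coefficient(per_class.values()), 1))
```

Note that classes with no objects at all do not appear in the counter, so they would have to be added with a count of 0 to be reflected in the coefficient.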

Go to the directory containing custom_config.yaml, which specifies the training parameters.


Setting up training parameters:

# Name of the *.txt file with the class names
CLASSES: classes.txt

# This file is generated automatically
CONFIG_PATH: dataset.yaml

# Path to data
DATA_PATH: /media/farm/ssd_1_tb_evo_sumsung/ODRS/user_datasets/yolo/plant

EPOCHS: 2
IMG_SIZE: 300

# MODEL ZOO:
# ["yolov5l", "yolov5m", "yolov5n", "yolov5s", "yolov5x",
#  "yolov7x", "yolov7", "yolov7-tiny", "yolov8x6", "yolov8x",
#  "yolov8s", "yolov8n", "yolov8m", "faster-rcnn", "ssd"]

# NOTE: For successful training of the ssd model, the size of your images should not exceed 512x512

MODEL: ssd


# For multiprocessing.
# For CPU:
#       GPU_COUNT: 0
#       SELECT_GPU: cpu

GPU_COUNT: 2
SELECT_GPU: 0,1

# parameters for autosplit dataset
SPLIT_TRAIN_VALUE: 0.6
SPLIT_VAL_VALUE: 0.35
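
The constraints stated in the comments above can be checked before launching a run. The sketch below is a hypothetical helper, not part of ODRS. It assumes that IMG_SIZE reflects the size of the images fed to the model (so the 512x512 note for ssd applies to it) and that the train and validation split values together must not exceed 1.

```python
# Model names copied from the MODEL ZOO comment above.
MODEL_ZOO = {
    "yolov5l", "yolov5m", "yolov5n", "yolov5s", "yolov5x",
    "yolov7x", "yolov7", "yolov7-tiny", "yolov8x6", "yolov8x",
    "yolov8s", "yolov8n", "yolov8m", "faster-rcnn", "ssd",
}


def check_training_config(config):
    """Return a list of problems found in a custom_config-style mapping."""
    problems = []
    model = config.get("MODEL")
    if model not in MODEL_ZOO:
        problems.append(f"unknown MODEL: {model}")
    # Assumption: IMG_SIZE is the side length of the images used for training.
    if model == "ssd" and config.get("IMG_SIZE", 0) > 512:
        problems.append("ssd expects images no larger than 512x512")
    split_sum = config.get("SPLIT_TRAIN_VALUE", 0) + config.get("SPLIT_VAL_VALUE", 0)
    if not 0 < split_sum <= 1:
        problems.append("SPLIT_TRAIN_VALUE + SPLIT_VAL_VALUE must be in (0, 1]")
    return problems


if __name__ == "__main__":
    example = {
        "MODEL": "ssd",
        "IMG_SIZE": 300,
        "SPLIT_TRAIN_VALUE": 0.6,
        "SPLIT_VAL_VALUE": 0.35,
    }
    print(check_training_config(example))
```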

Starting training:

NOTE: If, for example, custom_config.yaml specifies the path to the yolov5 model but you want to start yolov8, training will not start.

cd ODRS/ODRS/train_utils/train_model
python custom_train_all.py

After training, a new directory named runs appears in the ODRS root directory, and all experiment results are saved in it. For convenience, the result of each experiment is saved in a separate folder named as follows:

<year>-<month>-<day>_<hours>-<minutes>-<seconds>_<model_name>
|_ _exp
	|_...
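
A folder name in this form can be produced with the standard datetime module, as in the sketch below. The zero-padded, four-digit-year format is an assumption about how ODRS fills in the pattern, and run_dir_name is a hypothetical helper.

```python
from datetime import datetime


def run_dir_name(model_name, moment=None):
    """Build '<year>-<month>-<day>_<hours>-<minutes>-<seconds>_<model_name>'."""
    moment = moment or datetime.now()
    return f"{moment.strftime('%Y-%m-%d_%H-%M-%S')}_{model_name}"


if __name__ == "__main__":
    print(run_dir_name("yolov8x6"))
```

With zero-padded fields, sorting the folder names alphabetically also sorts the experiments chronologically.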

Using the API

To use the project in your own code, you can use the built-in API. Full examples of using the API are available here: Example API.


Initializing a task:

from ODRS.ODRS.api.ODRS import ODRS

# Initialize the object with the task parameters
odrs = ODRS(
    job="object_detection", data_path="full_data_path", classes="classes.txt",
    img_size="512", batch_size="25", epochs="300",
    model="yolov8x6", gpu_count=1, select_gpu="0", config_path="dataset.yaml",
    split_train_value=0.6, split_val_value=0.35,
)

Starting training:

from ODRS.ODRS.api.ODRS import ODRS
odrs.fit()

Getting results:

!yolo val detect data=path_to_data device=0 model=ODRS/runs/path_to_experiment/best.pt

Example results:

[Figure: two example result images]

This project is actively used at Insystem to test new models and datasets for garbage classification and detection.


Our team

Mikhail Gerasimchuk

ML Specialist

Artem Smetanin

ML Specialist

Saveli Rashin

Web developer

Sponsors

Insystem