Our system lets you choose the most suitable existing object recognition model based on user preferences and data. In addition to choosing the model architecture, the system helps you configure the environment and start training. The proposed recommendation system consists of several components that interact to generate recommendations for machine learning pipelines.
[Figure: system architecture. Labels: external parameters (received from users and third-party resources); input data; models YOLOv5, YOLOv7, YOLOv8, Faster R-CNN, SSD; RecommendationEngine (Datasets, Model); Training (Recommendation); Evaluation (Model).]
[Figure: recommendation flow. Labels: image data and user parameters; knowledge base; array of parameters; KNN algorithm; best model and recommendation lists. Example input: balance of speed and accuracy, image size 1920x1080, unbalanced dataset, 2200 images. Example output: Yolov5x, Yolov7x, Yolov5l.]
The recommendation algorithm is based on production rules. The primary set of rules (the knowledge base) is derived from an analysis of scientific sources and standard datasets, as well as from empirical processing of datasets from specific industries. The main criteria used to draw up the rules are:
1. Dimension of the model
2. The value of metrics (mAP, Recall, Accuracy) for selected datasets
3. The speed of the model on GPU and CPU
4. Supported image format and dimension
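The criteria above can be turned into a simple ranking. The sketch below is illustrative only and is not the ODRS implementation: the `ModelProfile` type, the `recommend` function, and the model names are hypothetical, the scores are placeholders rather than measured values, and plain nearest-neighbour ranking over two normalized criteria is assumed as the selection method.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class ModelProfile:
    """One hypothetical knowledge-base entry, with scores normalized to [0, 1]."""
    name: str
    speed: float
    accuracy: float


# Placeholder values for illustration only; these are NOT benchmark results.
KNOWLEDGE_BASE = [
    ModelProfile("model_small", speed=0.9, accuracy=0.4),
    ModelProfile("model_medium", speed=0.6, accuracy=0.7),
    ModelProfile("model_large", speed=0.2, accuracy=0.95),
]


def recommend(speed: float, accuracy: float, top_k: int = 3) -> list[str]:
    """Return the names of the top_k profiles closest to the requested trade-off."""
    target = (speed, accuracy)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda m: dist(target, (m.speed, m.accuracy)))
    return [m.name for m in ranked[:top_k]]


# A user who prefers accuracy over speed.
print(recommend(speed=0.2, accuracy=1.0, top_k=2))
```

A real knowledge base would hold more criteria per model (for example model size and supported image dimensions), but the ranking step would look much the same.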
[Figure: pipeline stages. Labels: select ML model and user data; selection of training parameters; trained ML model and metrics.]
Download the repository and install the necessary dependencies using the following commands:
git clone https://github.com/saaresearch/ODRS.git
cd ODRS/
pip install -r requirements.txt
To use the recommendation system or train the desired detector, put your dataset in YOLO format in the user_datasets/yolo directory. The dataset can have the following structure:
user_datasets
|_ _yolo
    |_ _ <folder_name_your_dataset>
        |_ _train
            |_ _images
                |_ <name_1>.jpg
                |_ ...
                |_ <name_N>.jpg
            |_ _labels
                |_ <name_1>.txt
                |_ ...
                |_ <name_N>.txt
        |_ _valid
            |_ _images
                |_ <name_1>.jpg
                |_ ...
                |_ <name_N>.jpg
            |_ _labels
                |_ <name_1>.txt
                |_ ...
                |_ <name_N>.txt
        |_ _test
            |_ _images
                |_ <name_1>.jpg
                |_ ...
                |_ <name_N>.jpg
            |_ _labels
                |_ <name_1>.txt
                |_ ...
                |_ <name_N>.txt
Alternatively, you can use the following flat structure, in which case your dataset will be split into subsets automatically:
user_datasets
|_ _yolo
    |_ _ <folder_name_your_dataset>
        |_ <name_1>.jpg
        |_ ...
        |_ <name_N>.jpg
        |_ ...
        |_ <name_1>.txt
        |_ ...
        |_ <name_N>.txt
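Before training, it can help to confirm that every image has a label file with the same base name, as the layouts above imply. The following is a minimal sketch for the first (pre-split) layout; the `find_unlabeled_images` helper and the `my_dataset` folder name are hypothetical and not part of ODRS.

```python
from pathlib import Path


def find_unlabeled_images(split_dir: Path) -> list[str]:
    """Return names of .jpg files in split_dir/images with no matching .txt label."""
    images = sorted((split_dir / "images").glob("*.jpg"))
    labels = {p.stem for p in (split_dir / "labels").glob("*.txt")}
    return [img.name for img in images if img.stem not in labels]


if __name__ == "__main__":
    # Hypothetical dataset folder; replace with your own path.
    dataset = Path("user_datasets/yolo/my_dataset")
    for split in ("train", "valid", "test"):
        missing = find_unlabeled_images(dataset / split)
        print(f"{split}: {len(missing)} image(s) without labels")
```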
Add a .txt file containing the names of all classes in your dataset to the root directory of the project.
Example classes.txt:
boat
car
dock
jetski
lift
After you have placed your dataset in the user_datasets/yolo folder and created the .txt file with the class names in the root directory, you can start working with the main functionality of the project.
To use the recommendation system, you need to configure ml_config.yaml. Go to the directory that contains it:
cd ODRS/ml_utils/config/
Open ml_config.yaml and set the necessary parameters and paths:
#dataset_path: path to data folder
#classes_path: path to classes.txt
#GPU: True/False
#speed: 1 - 5 if you want max speed choose 5. For lower speed 1
#accuracy: 1 - 10 if you want max accuracy choose 10. For lower accuracy 1
GPU: true
accuracy: 10
classes_path: classes.txt
dataset_path: /media/farm/ssd_1_tb_evo_sumsung/ODRS/user_datasets/yolo/plant
speed: 1
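The value ranges given in the configuration comments can be checked before running the optimizer. The sketch below works on an already loaded dictionary and is only an illustration; `validate_ml_config` is a hypothetical helper, not part of ODRS, and how ODRS itself reacts to out-of-range values is not covered here.

```python
def validate_ml_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    if not isinstance(config.get("GPU"), bool):
        problems.append("GPU must be true or false")
    for key, low, high in (("speed", 1, 5), ("accuracy", 1, 10)):
        value = config.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
            problems.append(f"{key} must be an integer from {low} to {high}")
    for key in ("classes_path", "dataset_path"):
        if not config.get(key):
            problems.append(f"{key} must be set")
    return problems


# Values copied from the example configuration.
example = {
    "GPU": True,
    "accuracy": 10,
    "classes_path": "classes.txt",
    "dataset_path": "/media/farm/ssd_1_tb_evo_sumsung/ODRS/user_datasets/yolo/plant",
    "speed": 1,
}
print(validate_ml_config(example))
```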
Go to the directory containing ml_model_optimizer.py and run the script:
cd ..
python ml_model_optimizer.py
If everything worked, you will see output similar to the following:
Number of images: 3496
Width: 960
Height: 540
Gini Coefficient: 94.0
Number of classes: 28
Top models for training:
1) yolov7
2) yolov8x6
3) yolov7x
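The Gini coefficient in this output presumably summarizes how unevenly the annotations are spread across classes. The exact formula ODRS uses is not shown here, so the sketch below is an assumption: it applies the standard Gini coefficient to per-class counts and scales it to a percentage. The `gini_percent` name is hypothetical.

```python
def gini_percent(class_counts: list[int]) -> float:
    """Standard Gini coefficient of the counts, scaled to 0-100 (0 = perfectly balanced)."""
    counts = sorted(class_counts)
    n, total = len(counts), sum(counts)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form of the Gini coefficient for values sorted in ascending order.
    weighted = sum(rank * count for rank, count in enumerate(counts, start=1))
    return 100.0 * (2.0 * weighted / (n * total) - (n + 1) / n)


print(gini_percent([10, 10, 10, 10]))  # balanced classes
print(gini_percent([0, 0, 0, 40]))     # all annotations in one class
```

Under this definition, higher values indicate a more unbalanced dataset.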
Go to the directory containing custom_config.yaml, which specifies the training parameters.
Setting up training parameters:
# Name of the *.txt file with class names
CLASSES: classes.txt

# This file is generated automatically
CONFIG_PATH: dataset.yaml

# Path to data
DATA_PATH: /media/farm/ssd_1_tb_evo_sumsung/ODRS/user_datasets/yolo/plant

EPOCHS: 2
IMG_SIZE: 300

# MODEL ZOO:
# ["yolov5l", "yolov5m", "yolov5n", "yolov5s", "yolov5x",
#  "yolov7x", "yolov7", "yolov7-tiny", "yolov8x6", "yolov8x",
#  "yolov8s", "yolov8n", "yolov8m", "faster-rcnn", "ssd"]

# NOTE: For successful training of the ssd model, the size of your images should not exceed 512x512
MODEL: ssd

# For multiprocessing.
# For CPU:
#     GPU_COUNT: 0
#     SELECT_GPU: cpu
GPU_COUNT: 2
SELECT_GPU: 0,1

# Parameters for automatic dataset splitting
SPLIT_TRAIN_VALUE: 0.6
SPLIT_VAL_VALUE: 0.35
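To make the two split parameters concrete, the sketch below shows one way such values could partition a list of image names. It is not the ODRS splitting code: `split_dataset` is a hypothetical helper, and treating whatever remains after the train and validation shares as the test set is an assumption.

```python
import random


def split_dataset(names: list[str], train_value: float, val_value: float, seed: int = 0):
    """Shuffle names and cut them into train, validation, and test lists."""
    if train_value < 0 or val_value < 0 or train_value + val_value > 1:
        raise ValueError("split values must be non-negative and sum to at most 1")
    shuffled = names[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_value)
    n_val = int(len(shuffled) * val_value)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # Assumption: the remainder is the test set.
    return train, val, test


names = [f"image_{i}.jpg" for i in range(100)]
train, val, test = split_dataset(names, train_value=0.6, val_value=0.35)
print(len(train), len(val), len(test))
```

With the values from the example configuration, 0.6 and 0.35, the remaining share would be 0.05 of the images.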
Starting training:
NOTE: If, for example, custom_config.yaml specifies the path to a yolov5 model and you try to start yolov8, training will not start.
cd ODRS/ODRS/train_utils/train_model
python custom_train_all.py
After training, a new directory named runs appears in the ODRS root directory, and all experiment results are saved in it. For convenience, the result of each experiment is saved in a separate folder named as follows:
<year>-<month>-<day>_<hours>-<minutes>-<seconds>_<model_name>
|_ _exp
|_...
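Because the folder name encodes a timestamp and the model name, experiment folders can be generated or parsed programmatically. The sketch below is a hypothetical illustration, not ODRS code; it assumes zero-padded date and time fields, which the naming pattern does not state explicitly.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"  # assumed zero-padded fields


def run_dir_name(model_name: str, when: datetime) -> str:
    """Build a name of the form date_time_model, matching the pattern above."""
    return f"{when.strftime(TIMESTAMP_FORMAT)}_{model_name}"


def parse_run_dir_name(name: str) -> tuple[datetime, str]:
    """Split a folder name back into its timestamp and model name."""
    date_part, time_part, model_name = name.split("_", 2)
    return datetime.strptime(f"{date_part}_{time_part}", TIMESTAMP_FORMAT), model_name


name = run_dir_name("yolov8x6", datetime(2023, 5, 17, 14, 3, 9))
print(name)
print(parse_run_dir_name(name))
```

Sorting folder names of this form alphabetically also sorts them chronologically, which makes it easy to find the latest experiment.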
To use the project in your own code, you can use the built-in API. Full examples of using the API are available here: Example API.
Initializing a task:
from ODRS.ODRS.api.ODRS import ODRS

#init object with parameters
odrs = ODRS(job="object_detection", data_path='full_data_path', classes="classes.txt",
            img_size="512", batch_size="25", epochs="300",
            model='yolov8x6', gpu_count=1, select_gpu="0", config_path="dataset.yaml",
            split_train_value=0.6, split_val_value=0.35)
Starting training:
from ODRS.ODRS.api.ODRS import ODRS

# odrs is the object created in the initialization step
odrs.fit()
Getting results:
!yolo val detect data=path_to_data device=0 model=ODRS/runs/path_to_experiment/best.pt
Example results:
This project is actively used in testing new models and datasets in Insystem for classification and detection of garbage.