Initial commit
@@ -0,0 +1,45 @@
|
|||||||
|
---
|
||||||
|
name: Bug Report or Feature Request
|
||||||
|
about: Create a report to help us improve
|
||||||
|
title: ''
|
||||||
|
labels: ''
|
||||||
|
assignees: ''
|
||||||
|
|
||||||
|
---
|
||||||
|
Before filing a report consider the following two questions:
|
||||||
|
|
||||||
|
### Have you followed the instructions exactly (word by word)?
|
||||||
|
|
||||||
|
### Have you checked the [troubleshooting](https://github.com/AntonMu/TrainYourOwnYOLO#troubleshooting) section?
|
||||||
|
|
||||||
|
Once you are familiar with the code, you're welcome to modify it. Please only continue to file a bug report if you encounter an issue with the provided code and after having followed the instructions.
|
||||||
|
|
||||||
|
If you have followed the instructions exactly, couldn't solve your problem with the provided troubleshooting tips, and would still like to file a bug report or make a feature request, please follow the steps below.
|
||||||
|
|
||||||
|
1. It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead).
|
||||||
|
2. **Every section** of the form below must be filled out.
|
||||||
|
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
### Readme
|
||||||
|
|
||||||
|
- I have followed all Readme instructions carefully word by word: **Answer**
|
||||||
|
|
||||||
|
### System information
|
||||||
|
- **What is the top-level directory of the model you are using**:
|
||||||
|
- **Have I written custom code (as opposed to using a stock example script provided in the repo)**:
|
||||||
|
- **OS Platform and Distribution (e.g., Linux Ubuntu 16.04)**:
|
||||||
|
- **TensorFlow version (use command below)**:
|
||||||
|
- **CUDA/cuDNN version**:
|
||||||
|
- **GPU model and memory**:
|
||||||
|
- **Exact command to reproduce**:
|
||||||
|
|
||||||
|
You can obtain the TensorFlow version with
|
||||||
|
|
||||||
|
`python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"`
|
||||||
|
|
||||||
|
### Describe the problem
|
||||||
|
Describe the problem clearly here. Be sure to convey here why it's a bug or a feature request.
|
||||||
|
|
||||||
|
### Source code / logs
|
||||||
|
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.
|
||||||
@@ -0,0 +1,135 @@
|
|||||||
|
# Keras model files
|
||||||
|
*.h5
|
||||||
|
*.weights
|
||||||
|
|
||||||
|
# tf event files
|
||||||
|
events.out.tfevents.*
|
||||||
|
|
||||||
|
# Byte-compiled / optimized / DLL files
|
||||||
|
__pycache__/
|
||||||
|
*.py[cod]
|
||||||
|
*$py.class
|
||||||
|
|
||||||
|
# Editor settings (VS Code)
|
||||||
|
|
||||||
|
.vscode/
|
||||||
|
.vscode
|
||||||
|
|
||||||
|
# C extensions
|
||||||
|
*.so
|
||||||
|
|
||||||
|
# Distribution / packaging
|
||||||
|
.Python
|
||||||
|
build/
|
||||||
|
develop-eggs/
|
||||||
|
dist/
|
||||||
|
downloads/
|
||||||
|
eggs/
|
||||||
|
.eggs/
|
||||||
|
lib/
|
||||||
|
lib64/
|
||||||
|
parts/
|
||||||
|
sdist/
|
||||||
|
var/
|
||||||
|
wheels/
|
||||||
|
pip-wheel-metadata/
|
||||||
|
share/python-wheels/
|
||||||
|
*.egg-info/
|
||||||
|
.installed.cfg
|
||||||
|
*.egg
|
||||||
|
MANIFEST
|
||||||
|
|
||||||
|
# PyInstaller
|
||||||
|
# Usually these files are written by a python script from a template
|
||||||
|
# before PyInstaller builds the exe, so as to inject date/other infos into it.
|
||||||
|
*.manifest
|
||||||
|
*.spec
|
||||||
|
|
||||||
|
# Installer logs
|
||||||
|
pip-log.txt
|
||||||
|
pip-delete-this-directory.txt
|
||||||
|
|
||||||
|
# Unit test / coverage reports
|
||||||
|
htmlcov/
|
||||||
|
.tox/
|
||||||
|
.nox/
|
||||||
|
.coverage
|
||||||
|
.coverage.*
|
||||||
|
.cache
|
||||||
|
nosetests.xml
|
||||||
|
coverage.xml
|
||||||
|
*.cover
|
||||||
|
.hypothesis/
|
||||||
|
.pytest_cache/
|
||||||
|
|
||||||
|
# Translations
|
||||||
|
*.mo
|
||||||
|
*.pot
|
||||||
|
|
||||||
|
# Django stuff:
|
||||||
|
*.log
|
||||||
|
local_settings.py
|
||||||
|
db.sqlite3
|
||||||
|
|
||||||
|
# Flask stuff:
|
||||||
|
instance/
|
||||||
|
.webassets-cache
|
||||||
|
|
||||||
|
# Scrapy stuff:
|
||||||
|
.scrapy
|
||||||
|
|
||||||
|
# Sphinx documentation
|
||||||
|
docs/_build/
|
||||||
|
|
||||||
|
# PyBuilder
|
||||||
|
target/
|
||||||
|
|
||||||
|
# Jupyter Notebook
|
||||||
|
.ipynb_checkpoints
|
||||||
|
|
||||||
|
# IPython
|
||||||
|
profile_default/
|
||||||
|
ipython_config.py
|
||||||
|
|
||||||
|
# pyenv
|
||||||
|
.python-version
|
||||||
|
|
||||||
|
# pipenv
|
||||||
|
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
|
||||||
|
# However, in case of collaboration, if having platform-specific dependencies or dependencies
|
||||||
|
# having no cross-platform support, pipenv may install dependencies that don't work, or not
|
||||||
|
# install all needed dependencies.
|
||||||
|
#Pipfile.lock
|
||||||
|
|
||||||
|
# celery beat schedule file
|
||||||
|
celerybeat-schedule
|
||||||
|
|
||||||
|
# SageMath parsed files
|
||||||
|
*.sage.py
|
||||||
|
|
||||||
|
# Environments
|
||||||
|
.env
|
||||||
|
.venv
|
||||||
|
env/
|
||||||
|
venv/
|
||||||
|
ENV/
|
||||||
|
env.bak/
|
||||||
|
venv.bak/
|
||||||
|
|
||||||
|
# Spyder project settings
|
||||||
|
.spyderproject
|
||||||
|
.spyproject
|
||||||
|
|
||||||
|
# Rope project settings
|
||||||
|
.ropeproject
|
||||||
|
|
||||||
|
# mkdocs documentation
|
||||||
|
/site
|
||||||
|
|
||||||
|
# mypy
|
||||||
|
.mypy_cache/
|
||||||
|
.dmypy.json
|
||||||
|
dmypy.json
|
||||||
|
|
||||||
|
# Pyre type checker
|
||||||
|
.pyre/
|
||||||
@@ -0,0 +1,80 @@
|
|||||||
|
from PIL import Image
|
||||||
|
from os import path, makedirs
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import pandas as pd
|
||||||
|
import sys
|
||||||
|
import argparse
|
||||||
|
|
||||||
|
|
||||||
|
def get_parent_dir(n=1):
    """Return the n-th parent directory of the directory containing this file."""
    current_path = os.path.dirname(os.path.abspath(__file__))
    for _ in range(n):
        current_path = os.path.dirname(current_path)
    return current_path
|
||||||
|
|
||||||
|
|
||||||
|
sys.path.append(os.path.join(get_parent_dir(1), "Utils"))
|
||||||
|
from Convert_Format import convert_vott_csv_to_yolo
|
||||||
|
|
||||||
|
Data_Folder = os.path.join(get_parent_dir(1), "Data")
|
||||||
|
VoTT_Folder = os.path.join(
|
||||||
|
Data_Folder, "Source_Images", "Training_Images", "vott-csv-export"
|
||||||
|
)
|
||||||
|
VoTT_csv = os.path.join(VoTT_Folder, "Annotations-export.csv")
|
||||||
|
YOLO_filename = os.path.join(VoTT_Folder, "data_train.txt")
|
||||||
|
|
||||||
|
model_folder = os.path.join(Data_Folder, "Model_Weights")
|
||||||
|
classes_filename = os.path.join(model_folder, "data_classes.txt")
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
# suppress any inherited default values
|
||||||
|
parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
|
||||||
|
"""
|
||||||
|
Command line options
|
||||||
|
"""
|
||||||
|
parser.add_argument(
|
||||||
|
"--VoTT_Folder",
|
||||||
|
type=str,
|
||||||
|
default=VoTT_Folder,
|
||||||
|
help="Absolute path to the exported files from the image tagging step with VoTT. Default is "
|
||||||
|
+ VoTT_Folder,
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--VoTT_csv",
|
||||||
|
type=str,
|
||||||
|
default=VoTT_csv,
|
||||||
|
help="Absolute path to the *.csv file exported from VoTT. Default is "
|
||||||
|
+ VoTT_csv,
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--YOLO_filename",
|
||||||
|
type=str,
|
||||||
|
default=YOLO_filename,
|
||||||
|
help="Absolute path to the file where the annotations in YOLO format should be saved. Default is "
|
||||||
|
+ YOLO_filename,
|
||||||
|
)
|
||||||
|
|
||||||
|
FLAGS = parser.parse_args()
|
||||||
|
|
||||||
|
# Prepare the dataset for YOLO
|
||||||
|
multi_df = pd.read_csv(FLAGS.VoTT_csv)
|
||||||
|
labels = multi_df["label"].unique()
|
||||||
|
labeldict = dict(zip(labels, range(len(labels))))
|
||||||
|
multi_df.drop_duplicates(subset=None, keep="first", inplace=True)
|
||||||
|
train_path = FLAGS.VoTT_Folder
|
||||||
|
convert_vott_csv_to_yolo(
|
||||||
|
multi_df, labeldict, path=train_path, target_name=FLAGS.YOLO_filename
|
||||||
|
)
|
||||||
|
|
||||||
|
    # Make classes file: one class name per line, ordered by class index
    sorted_labels = sorted(labeldict.items(), key=lambda x: x[1])
    with open(classes_filename, "w") as f:
        for label, _ in sorted_labels:
            f.write(label + "\n")
|
||||||
@@ -0,0 +1,51 @@
|
|||||||
|
# TrainYourOwnYOLO: Image Annotation
|
||||||
|
|
||||||
|
## Dataset
|
||||||
|
To train the YOLO object detector on your own dataset, copy your training images to [`TrainYourOwnYOLO/Data/Source_Images/Training_Images`](/Data/Source_Images/Training_Images/). By default, this directory is pre-populated with 101 cat images. Feel free to delete all existing cat images to make your project cleaner.
|
||||||
|
|
||||||
|
### Creating a Dataset from Scratch
|
||||||
|
If you do not already have an image dataset, consider using a Chrome extension such as [Fatkun Batch Downloader](https://chrome.google.com/webstore/detail/fatkun-batch-download-ima/nnjjahlikiabnchcpehcpkdeckfgnohf?hl=en) which lets you search and download images from Google Images. For instance, you can build a fidget spinner detector by searching for images with fidget spinners.
|
||||||
|
|
||||||
|
## Annotation
|
||||||
|
To make our detector learn, we first need to feed it some good training examples. We use Microsoft's Visual Object Tagging Tool (VoTT) to manually label images in our training folder [`TrainYourOwnYOLO/Data/Source_Images/Training_Images`](/Data/Source_Images/Training_Images/). To achieve decent results, annotate at least 100 images. For good results, label at least 300 images, and for great results, label 1000+ images.
|
||||||
|
|
||||||
|
### Download VoTT
|
||||||
|
Head to the VoTT [releases](https://github.com/Microsoft/VoTT/releases) page and, under `Assets`, download and install the package for your operating system:
|
||||||
|
- `vott-2.x.x-darwin.dmg` for Mac users,
|
||||||
|
- `vott-2.x.x-win32.exe` for Windows users and
|
||||||
|
- `vott-2.x.x-linux.snap` for Linux users.
|
||||||
|
|
||||||
|
Installing `*.snap` files requires the snapd package manager which is available at [snapcraft.io](https://snapcraft.io/docs/installing-snapd).
|
||||||
|
|
||||||
|
### Create a New Project
|
||||||
|
|
||||||
|
Create a **New Project** and call it `Annotations`. It is highly recommended to use `Annotations` as your project name. If you would like to use a different name for your project, you will have to modify the command line arguments of subsequent scripts accordingly.
|
||||||
|
|
||||||
|
Under **Source Connection** choose **Add Connection** and put `Images` as **Display Name**. Under **Provider** choose **Local File System** and select [`TrainYourOwnYOLO/Data/Source_Images/Training_Images`](/Data/Source_Images/Training_Images) and then **Save Connection**. For **Target Connection** choose the same folder as for **Source Connection**. Hit **Save Project** to finish project creation.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
### Export Settings
|
||||||
|
Navigate to **Export Settings** in the sidebar and then change the **Provider** to `Comma Separated Values (CSV)`, then hit **Save Export Settings**.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
|
||||||
|
### Labeling
|
||||||
|
First create a new tag on the right and give it a relevant tag name. In our example, we choose `Cat_Face`. Then draw bounding boxes around your objects. You can use the number key **1** to quickly assign the first tag to the current bounding box.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
### Export Results
|
||||||
|
Once you have labeled enough images, press **CTRL+E** to export the project. You should now see a folder called [`vott-csv-export`](/Data/Source_Images/Training_Images/vott-csv-export) in the [`Training_Images`](/Data/Source_Images/Training_Images) directory. Within that folder, you should see a `*.csv` file called [`Annotations-export.csv`](/Data/Source_Images/Training_Images/vott-csv-export/Annotations-export.csv) which contains file names and bounding box coordinates.
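
Before converting, it can help to sanity-check the export. The following is a minimal sketch, using only the standard library, that counts annotated images and bounding boxes per label. It assumes the export has `image` and `label` columns (the conversion script in this repo reads a `label` column; the `image` column name is an assumption about the VoTT CSV layout). `summarize_annotations` is a hypothetical helper and not part of the repo.

```python
import csv
import io
from collections import Counter


def summarize_annotations(csv_file):
    """Count distinct images and bounding boxes per label in a VoTT-style CSV.

    `csv_file` is any open text file object. The column names "image" and
    "label" are assumptions about the export layout.
    """
    images = set()
    boxes_per_label = Counter()
    for row in csv.DictReader(csv_file):
        images.add(row["image"])
        boxes_per_label[row["label"]] += 1
    return len(images), dict(boxes_per_label)


# Tiny in-memory example standing in for Annotations-export.csv
example = io.StringIO(
    '"image","xmin","ymin","xmax","ymax","label"\n'
    '"cat_1.jpg",10.0,20.0,110.0,140.0,"Cat_Face"\n'
    '"cat_1.jpg",200.0,30.0,260.0,90.0,"Cat_Face"\n'
    '"cat_2.jpg",5.0,5.0,50.0,60.0,"Cat_Face"\n'
)
num_images, label_counts = summarize_annotations(example)
print(num_images, label_counts)
```

For a real export, pass `open(path_to_csv, newline="")` instead of the in-memory example.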
|
||||||
|
|
||||||
|
## Convert to YOLO Format
|
||||||
|
As a final step, convert the VoTT csv format to the YOLOv3 format. To do so, run the conversion script from within the [`TrainYourOwnYOLO/1_Image_Annotation`](/1_Image_Annotation/) folder:
|
||||||
|
|
||||||
|
```
|
||||||
|
python Convert_to_YOLO_format.py
|
||||||
|
```
|
||||||
|
The script generates two output files: [`data_train.txt`](/Data/Source_Images/Training_Images/vott-csv-export/data_train.txt) located in the [`TrainYourOwnYOLO/Data/Source_Images/Training_Images/vott-csv-export`](/Data/Source_Images/Training_Images/vott-csv-export) folder and [`data_classes.txt`](/Data/Model_Weights/data_classes.txt) located in the [`TrainYourOwnYOLO/Data/Model_Weights`](/Data/Model_Weights/) folder. To list available command line options run `python Convert_to_YOLO_format.py -h`.
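
To illustrate what the two output files contain, here is a hedged sketch of the conversion idea. It follows the row format documented by keras-yolo3 (`image_file_path box1 box2 ... boxN`, with each box written as `x_min,y_min,x_max,y_max,class_id`). `rows_to_yolo_lines` is a hypothetical helper, not the repo's `convert_vott_csv_to_yolo`; truncating coordinates to integers and numbering classes in order of first appearance are assumptions.

```python
import os
from collections import OrderedDict


def rows_to_yolo_lines(rows, image_dir):
    """Group (image, xmin, ymin, xmax, ymax, label) rows into annotation lines.

    Returns (lines, class_names): one annotation line per image, and the
    class names ordered by class index.
    """
    class_ids = OrderedDict()  # label -> class index, in order of first appearance
    boxes = OrderedDict()      # image name -> list of box strings
    for image, xmin, ymin, xmax, ymax, label in rows:
        class_id = class_ids.setdefault(label, len(class_ids))
        coords = ",".join(str(int(v)) for v in (xmin, ymin, xmax, ymax))
        boxes.setdefault(image, []).append(coords + "," + str(class_id))
    lines = [
        os.path.join(image_dir, image) + " " + " ".join(image_boxes)
        for image, image_boxes in boxes.items()
    ]
    return lines, list(class_ids)


rows = [
    ("cat_1.jpg", 10.0, 20.0, 110.0, 140.0, "Cat_Face"),
    ("cat_1.jpg", 200.0, 30.0, 260.0, 90.0, "Cat_Face"),
    ("dog_1.jpg", 5.0, 5.0, 50.0, 60.0, "Dog_Face"),
]
lines, class_names = rows_to_yolo_lines(rows, "images")
for line in lines:
    print(line)
print(class_names)
```

In this sketch, `lines` corresponds to the content of `data_train.txt` and `class_names` to the content of `data_classes.txt`.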
|
||||||
|
|
||||||
|
### That's all for annotation!
|
||||||
|
Next, go to [`TrainYourOwnYOLO/2_Training`](/2_Training) to train your YOLOv3 detector.
|
||||||
|
@@ -0,0 +1,15 @@
|
|||||||
|
# TrainYourOwnYOLO: Training on AWS
|
||||||
|
|
||||||
|
If your local machine does not have a GPU, training could take a very long time. To speed things up, use an [AWS](https://aws.amazon.com/) GPU instance.
|
||||||
|
|
||||||
|
## Spinning up a GPU Instance
|
||||||
|
To spin up a GPU instance, go to **EC2** and select **Launch Instance**. Then find **Deep Learning AMI (Ubuntu) Version xx.x** and hit **Select**. Select **p2.xlarge** as the instance type. Click through to Step 4, `Add Storage`, and add at least 10 GB more than the default value. Proceed by hitting **Review and Launch**.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
## Starting the Training
|
||||||
|
Connect to your instance and follow the same steps as on your local machine. Make sure that all your Source Images are in [`TrainYourOwnYOLO/Data/Source_Images`](/Data/Source_Images) and that both
|
||||||
|
- [`TrainYourOwnYOLO/Data/Source_Images/Training_Images/vott-csv-export/data_train.txt`](/Data/Source_Images/Training_Images/vott-csv-export/data_train.txt) and
|
||||||
|
- [`TrainYourOwnYOLO/Data/Model_Weights/data_classes.txt`](/Data/Model_Weights/data_classes.txt)
|
||||||
|
|
||||||
|
are up-to-date.
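
If annotation was done on a different machine, the image paths stored in `data_train.txt` will not match the instance. The training script calls `ChangeToOtherMachine` to handle this; its implementation is not shown here, so the sketch below is only a hypothetical illustration of the idea: everything up to and including the repository folder name is replaced by the local repository root. The function name, the `repo_name` parameter, and the assumption that image paths contain no spaces are all illustrative.

```python
def relocate_annotation_line(line, local_repo_root, repo_name="TrainYourOwnYOLO"):
    """Rewrite the image path at the start of an annotation line for this machine.

    Everything up to and including `repo_name` is replaced by `local_repo_root`.
    Lines whose path does not contain `repo_name` are returned unchanged.
    Assumes the image path itself contains no spaces.
    """
    path, sep, boxes = line.partition(" ")
    path = path.replace("\\", "/")  # normalise Windows separators
    _, found, tail = path.partition(repo_name)
    if not found:
        return line
    return local_repo_root.rstrip("/") + tail + sep + boxes


windows_line = (
    "C:\\Users\\me\\TrainYourOwnYOLO\\Data\\Source_Images"
    "\\Training_Images\\cat_1.jpg 10,20,110,140,0"
)
relocated = relocate_annotation_line(windows_line, "/home/ubuntu/TrainYourOwnYOLO")
print(relocated)
```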
|
||||||
|
@@ -0,0 +1,46 @@
|
|||||||
|
import os
|
||||||
|
import subprocess
|
||||||
|
import time
|
||||||
|
import sys
|
||||||
|
import argparse
|
||||||
|
import requests
|
||||||
|
import progressbar
|
||||||
|
|
||||||
|
FLAGS = None
|
||||||
|
|
||||||
|
root_folder = os.path.dirname(os.path.abspath(__file__))
|
||||||
|
download_folder = os.path.join(root_folder, "src", "keras_yolo3")
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
# Delete all default flags
|
||||||
|
parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
|
||||||
|
"""
|
||||||
|
Command line options
|
||||||
|
"""
|
||||||
|
parser.add_argument(
|
||||||
|
"--download_folder",
|
||||||
|
type=str,
|
||||||
|
default=download_folder,
|
||||||
|
help="Folder to download weights to. Default is " + download_folder,
|
||||||
|
)
|
||||||
|
|
||||||
|
FLAGS = parser.parse_args()
|
||||||
|
|
||||||
|
    url = "https://pjreddie.com/media/files/yolov3.weights"
    # Honour the --download_folder flag when choosing where to save the weights
    weights_file = os.path.abspath(
        os.path.join(FLAGS.download_folder, "yolov3.weights")
    )
    chunk_size = 100

    r = requests.get(url, stream=True)
    file_size = int(r.headers.get("content-length"))
    num_bars = file_size // chunk_size
    bar = progressbar.ProgressBar(maxval=num_bars).start()

    # Stream the weights to disk and advance the progress bar once per chunk
    with open(weights_file, "wb") as f:
        for i, chunk in enumerate(r.iter_content(chunk_size)):
            f.write(chunk)
            bar.update(min(i, num_bars))
    bar.finish()

    # convert.py and yolov3.cfg live in the default download folder
    call_string = 'python convert.py yolov3.cfg "{}" yolo.h5'.format(weights_file)

    subprocess.call(call_string, shell=True, cwd=download_folder)
|
||||||
@@ -0,0 +1,26 @@
|
|||||||
|
# TrainYourOwnYOLO: Training
|
||||||
|
|
||||||
|
## Training
|
||||||
|
Using the training images located in [`TrainYourOwnYOLO/Data/Source_Images/Training_Images`](/Data/Source_Images/Training_Images) and the annotation file [`data_train.txt`](/Data/Source_Images/Training_Images/vott-csv-export) which we have created in the [previous step](/1_Image_Annotation/) we are now ready to train our YOLOv3 detector.
|
||||||
|
|
||||||
|
## Download and Convert Pre-Trained Weights
|
||||||
|
Before getting started, download the pre-trained YOLOv3 weights and convert them to the Keras format. To do both, run the download and conversion script from within the [`TrainYourOwnYOLO/2_Training`](/2_Training/) directory:
|
||||||
|
|
||||||
|
```
|
||||||
|
python Download_and_Convert_YOLO_weights.py
|
||||||
|
```
|
||||||
|
To list available command line options run `python Download_and_Convert_YOLO_weights.py -h`.
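
The weights file is large, so the script streams it to disk in chunks rather than loading it into memory at once. The following is a minimal sketch of that technique on generic file-like objects, so it runs without network access. `copy_in_chunks` is a hypothetical helper, not the repo's code, and the 1 MiB default chunk size is an assumption.

```python
import io


def copy_in_chunks(src, dst, chunk_size=1024 * 1024):
    """Copy from file-like `src` to `dst` in fixed-size chunks; return bytes written."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for the HTTP response body and the output file
source = io.BytesIO(b"x" * 2500)
target = io.BytesIO()
written = copy_in_chunks(source, target, chunk_size=1000)
print(written)
```

For a real download, `src` could be the object returned by `urllib.request.urlopen(url)` and `dst` a file opened in `"wb"` mode.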
|
||||||
|
|
||||||
|
The downloaded `yolov3.weights` file contains a detector trained on the COCO dataset on top of a backbone pre-trained on the [ImageNet 1000 dataset](http://image-net.org/challenges/LSVRC/2015/index). The weights therefore work best for object detection tasks whose images and objects resemble those in these datasets.
|
||||||
|
|
||||||
|
## Train YOLOv3 Detector
|
||||||
|
To start the training, run the training script from within the [`TrainYourOwnYOLO/2_Training`](/2_Training/) directory:
|
||||||
|
```
|
||||||
|
python Train_YOLO.py
|
||||||
|
```
|
||||||
|
Depending on your set-up, this process can take a few minutes to a few hours. The final weights are saved in [`TrainYourOwnYOLO/Data/Model_Weights`](/Data/Model_Weights). To list available command line options run `python Train_YOLO.py -h`.
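
The training script shuffles the annotation lines, holds out a fraction for validation (`--val_split`, 0.1 by default), and derives the number of steps per epoch from the batch size. Here is a minimal sketch of that bookkeeping. It uses the standard library `random` module instead of the NumPy shuffling used in the script, and the helper names are hypothetical.

```python
import random


def split_annotations(lines, val_split=0.1, seed=None):
    """Shuffle annotation lines and split them into training and validation lists."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    num_val = int(len(shuffled) * val_split)
    num_train = len(shuffled) - num_val
    return shuffled[:num_train], shuffled[num_train:]


def steps_per_epoch(num_samples, batch_size):
    """Number of full batches per epoch, never less than one."""
    return max(1, num_samples // batch_size)


lines = ["img_{}.jpg 0,0,10,10,0".format(i) for i in range(101)]
train, val = split_annotations(lines, val_split=0.1, seed=0)
print(len(train), len(val))
print(steps_per_epoch(len(train), 32), steps_per_epoch(len(val), 32))
```

With 101 annotated images and a 10% split, this yields 91 training and 10 validation lines. At a batch size of 32 that is two training steps and one validation step per epoch.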
|
||||||
|
|
||||||
|
If training is too slow on your local machine, consider using cloud computing services such as AWS to speed things up. To learn more about training on AWS navigate to [`TrainYourOwnYOLO/2_Training/AWS`](/2_Training/AWS).
|
||||||
|
|
||||||
|
### That's all for training!
|
||||||
|
Next, go to [`TrainYourOwnYOLO/3_Inference`](/3_Inference) to test your YOLO detector on new images!
|
||||||
@@ -0,0 +1,283 @@
|
|||||||
|
"""
|
||||||
|
MODIFIED FROM keras-yolo3 PACKAGE, https://github.com/qqwweee/keras-yolo3
|
||||||
|
Retrain the YOLO model for your own dataset.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import argparse
|
||||||
|
|
||||||
|
|
||||||
|
def get_parent_dir(n=1):
    """Return the n-th parent directory of the directory containing this file."""
    current_path = os.path.dirname(os.path.abspath(__file__))
    for _ in range(n):
        current_path = os.path.dirname(current_path)
    return current_path
|
||||||
|
|
||||||
|
|
||||||
|
src_path = os.path.join(get_parent_dir(0), "src")
|
||||||
|
sys.path.append(src_path)
|
||||||
|
|
||||||
|
utils_path = os.path.join(get_parent_dir(1), "Utils")
|
||||||
|
sys.path.append(utils_path)
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
|
import keras.backend as K
|
||||||
|
from keras.layers import Input, Lambda
|
||||||
|
from keras.models import Model
|
||||||
|
from keras.optimizers import Adam
|
||||||
|
from keras.callbacks import (
|
||||||
|
TensorBoard,
|
||||||
|
ModelCheckpoint,
|
||||||
|
ReduceLROnPlateau,
|
||||||
|
EarlyStopping,
|
||||||
|
)
|
||||||
|
from keras_yolo3.yolo3.model import (
|
||||||
|
preprocess_true_boxes,
|
||||||
|
yolo_body,
|
||||||
|
tiny_yolo_body,
|
||||||
|
yolo_loss,
|
||||||
|
)
|
||||||
|
from keras_yolo3.yolo3.utils import get_random_data
|
||||||
|
from PIL import Image
|
||||||
|
from time import time
|
||||||
|
import pickle
|
||||||
|
|
||||||
|
from Train_Utils import (
|
||||||
|
get_classes,
|
||||||
|
get_anchors,
|
||||||
|
create_model,
|
||||||
|
create_tiny_model,
|
||||||
|
data_generator,
|
||||||
|
data_generator_wrapper,
|
||||||
|
ChangeToOtherMachine,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
keras_path = os.path.join(src_path, "keras_yolo3")
|
||||||
|
Data_Folder = os.path.join(get_parent_dir(1), "Data")
|
||||||
|
Image_Folder = os.path.join(Data_Folder, "Source_Images", "Training_Images")
|
||||||
|
VoTT_Folder = os.path.join(Image_Folder, "vott-csv-export")
|
||||||
|
YOLO_filename = os.path.join(VoTT_Folder, "data_train.txt")
|
||||||
|
|
||||||
|
Model_Folder = os.path.join(Data_Folder, "Model_Weights")
|
||||||
|
YOLO_classname = os.path.join(Model_Folder, "data_classes.txt")
|
||||||
|
|
||||||
|
log_dir = Model_Folder
|
||||||
|
anchors_path = os.path.join(keras_path, "model_data", "yolo_anchors.txt")
|
||||||
|
weights_path = os.path.join(keras_path, "yolo.h5")
|
||||||
|
|
||||||
|
FLAGS = None
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
# Delete all default flags
|
||||||
|
parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
|
||||||
|
"""
|
||||||
|
Command line options
|
||||||
|
"""
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--annotation_file",
|
||||||
|
type=str,
|
||||||
|
default=YOLO_filename,
|
||||||
|
help="Path to annotation file for Yolo. Default is " + YOLO_filename,
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--classes_file",
|
||||||
|
type=str,
|
||||||
|
default=YOLO_classname,
|
||||||
|
help="Path to YOLO classnames. Default is " + YOLO_classname,
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--log_dir",
|
||||||
|
type=str,
|
||||||
|
default=log_dir,
|
||||||
|
help="Folder to save training logs and trained weights to. Default is "
|
||||||
|
+ log_dir,
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--anchors_path",
|
||||||
|
type=str,
|
||||||
|
default=anchors_path,
|
||||||
|
help="Path to YOLO anchors. Default is " + anchors_path,
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--weights_path",
|
||||||
|
type=str,
|
||||||
|
default=weights_path,
|
||||||
|
help="Path to pre-trained YOLO weights. Default is " + weights_path,
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--val_split",
|
||||||
|
type=float,
|
||||||
|
default=0.1,
|
||||||
|
help="Fraction of the training set to be used for validation. Default is 0.1 (10%).",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--is_tiny",
|
||||||
|
default=False,
|
||||||
|
action="store_true",
|
||||||
|
help="Use the tiny YOLO version for faster training and inference at the cost of accuracy. Default is False.",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--random_seed",
|
||||||
|
type=int,
|
||||||
|
default=None,
|
||||||
|
help="Random seed value to make script deterministic. Default is 'None', i.e. non-deterministic.",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--epochs",
|
||||||
|
type=int,
|
||||||
|
default=51,
|
||||||
|
help="Number of epochs for training last layers and number of epochs for fine-tuning layers. Default is 51.",
|
||||||
|
)
|
||||||
|
|
||||||
|
FLAGS = parser.parse_args()
|
||||||
|
|
||||||
|
np.random.seed(FLAGS.random_seed)
|
||||||
|
|
||||||
|
log_dir = FLAGS.log_dir
|
||||||
|
|
||||||
|
class_names = get_classes(FLAGS.classes_file)
|
||||||
|
num_classes = len(class_names)
|
||||||
|
anchors = get_anchors(FLAGS.anchors_path)
|
||||||
|
weights_path = FLAGS.weights_path
|
||||||
|
|
||||||
|
input_shape = (416, 416) # multiple of 32, height, width
|
||||||
|
epoch1, epoch2 = FLAGS.epochs, FLAGS.epochs
|
||||||
|
|
||||||
|
is_tiny_version = len(anchors) == 6 # default setting
|
||||||
|
if FLAGS.is_tiny:
|
||||||
|
model = create_tiny_model(
|
||||||
|
input_shape, anchors, num_classes, freeze_body=2, weights_path=weights_path
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
model = create_model(
|
||||||
|
input_shape, anchors, num_classes, freeze_body=2, weights_path=weights_path
|
||||||
|
) # make sure you know what you freeze
|
||||||
|
|
||||||
|
log_dir_time = os.path.join(log_dir, "{}".format(int(time())))
|
||||||
|
logging = TensorBoard(log_dir=log_dir_time)
|
||||||
|
checkpoint = ModelCheckpoint(
|
||||||
|
os.path.join(log_dir, "checkpoint.h5"),
|
||||||
|
monitor="val_loss",
|
||||||
|
save_weights_only=True,
|
||||||
|
save_best_only=True,
|
||||||
|
period=5,
|
||||||
|
)
|
||||||
|
reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=3, verbose=1)
|
||||||
|
early_stopping = EarlyStopping(
|
||||||
|
monitor="val_loss", min_delta=0, patience=10, verbose=1
|
||||||
|
)
|
||||||
|
|
||||||
|
val_split = FLAGS.val_split
|
||||||
|
with open(FLAGS.annotation_file) as f:
|
||||||
|
lines = f.readlines()
|
||||||
|
|
||||||
|
# This step makes sure that the path names correspond to the local machine
|
||||||
|
# This is important if annotation and training are done on different machines (e.g. training on AWS)
|
||||||
|
lines = ChangeToOtherMachine(lines, remote_machine="")
|
||||||
|
np.random.shuffle(lines)
|
||||||
|
num_val = int(len(lines) * val_split)
|
||||||
|
num_train = len(lines) - num_val
|
||||||
|
|
||||||
|
# Train with frozen layers first, to get a stable loss.
|
||||||
|
# Adjust num epochs to your dataset. This step is enough to obtain a decent model.
|
||||||
|
if True:
|
||||||
|
model.compile(
|
||||||
|
optimizer=Adam(lr=1e-3),
|
||||||
|
loss={
|
||||||
|
# use custom yolo_loss Lambda layer.
|
||||||
|
"yolo_loss": lambda y_true, y_pred: y_pred
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
batch_size = 32
|
||||||
|
print(
|
||||||
|
"Train on {} samples, val on {} samples, with batch size {}.".format(
|
||||||
|
num_train, num_val, batch_size
|
||||||
|
)
|
||||||
|
)
|
||||||
|
history = model.fit_generator(
|
||||||
|
data_generator_wrapper(
|
||||||
|
lines[:num_train], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
steps_per_epoch=max(1, num_train // batch_size),
|
||||||
|
validation_data=data_generator_wrapper(
|
||||||
|
lines[num_train:], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
validation_steps=max(1, num_val // batch_size),
|
||||||
|
epochs=epoch1,
|
||||||
|
initial_epoch=0,
|
||||||
|
callbacks=[logging, checkpoint],
|
||||||
|
)
|
||||||
|
model.save_weights(os.path.join(log_dir, "trained_weights_stage_1.h5"))
|
||||||
|
|
||||||
|
        # Save the per-epoch losses as plain text, one value per line
        step1_train_loss = history.history["loss"]
        with open(os.path.join(log_dir_time, "step1_loss.npy"), "w") as f:
            for item in step1_train_loss:
                f.write("%s\n" % item)

        step1_val_loss = np.array(history.history["val_loss"])
        with open(os.path.join(log_dir_time, "step1_val_loss.npy"), "w") as f:
            for item in step1_val_loss:
                f.write("%s\n" % item)
|
||||||
|
|
||||||
|
# Unfreeze and continue training, to fine-tune.
|
||||||
|
# Train longer if the result is unsatisfactory.
|
||||||
|
if True:
|
||||||
|
for i in range(len(model.layers)):
|
||||||
|
model.layers[i].trainable = True
|
||||||
|
model.compile(
|
||||||
|
optimizer=Adam(lr=1e-4), loss={"yolo_loss": lambda y_true, y_pred: y_pred}
|
||||||
|
) # recompile to apply the change
|
||||||
|
print("Unfreeze all layers.")
|
||||||
|
|
||||||
|
batch_size = (
|
||||||
|
4 # note that more GPU memory is required after unfreezing the body
|
||||||
|
)
|
||||||
|
print(
|
||||||
|
"Train on {} samples, val on {} samples, with batch size {}.".format(
|
||||||
|
num_train, num_val, batch_size
|
||||||
|
)
|
||||||
|
)
|
||||||
|
history = model.fit_generator(
|
||||||
|
data_generator_wrapper(
|
||||||
|
lines[:num_train], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
steps_per_epoch=max(1, num_train // batch_size),
|
||||||
|
validation_data=data_generator_wrapper(
|
||||||
|
lines[num_train:], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
validation_steps=max(1, num_val // batch_size),
|
||||||
|
epochs=epoch1 + epoch2,
|
||||||
|
initial_epoch=epoch1,
|
||||||
|
callbacks=[logging, checkpoint, reduce_lr, early_stopping],
|
||||||
|
)
|
||||||
|
model.save_weights(os.path.join(log_dir, "trained_weights_final.h5"))
|
||||||
|
        # Save the per-epoch losses as plain text, one value per line
        step2_train_loss = history.history["loss"]
        with open(os.path.join(log_dir_time, "step2_loss.npy"), "w") as f:
            for item in step2_train_loss:
                f.write("%s\n" % item)

        step2_val_loss = np.array(history.history["val_loss"])
        with open(os.path.join(log_dir_time, "step2_val_loss.npy"), "w") as f:
            for item in step2_val_loss:
                f.write("%s\n" % item)
|
||||||
@@ -0,0 +1,3 @@
|
|||||||
|
# Keras YOLOv3
|
||||||
|
|
||||||
|
This part of the repo is a fork of [**qqwweee/keras-yolo3**](https://github.com/qqwweee/keras-yolo3): **A Keras implementation of YOLOv3 (Tensorflow backend)**.
|
||||||
@@ -0,0 +1,21 @@
|
|||||||
|
MIT License
|
||||||
|
|
||||||
|
Copyright (c) 2018 qqwweee
|
||||||
|
|
||||||
|
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||||
|
of this software and associated documentation files (the "Software"), to deal
|
||||||
|
in the Software without restriction, including without limitation the rights
|
||||||
|
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||||
|
copies of the Software, and to permit persons to whom the Software is
|
||||||
|
furnished to do so, subject to the following conditions:
|
||||||
|
|
||||||
|
The above copyright notice and this permission notice shall be included in all
|
||||||
|
copies or substantial portions of the Software.
|
||||||
|
|
||||||
|
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||||
|
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||||
|
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||||
|
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||||
|
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||||
|
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||||
|
SOFTWARE.
|
||||||
@@ -0,0 +1,99 @@
# keras-yolo3

[License](LICENSE)

## Introduction

A Keras implementation of YOLOv3 (TensorFlow backend) inspired by [allanzelener/YAD2K](https://github.com/allanzelener/YAD2K).


---

## Quick Start

1. Download YOLOv3 weights from the [YOLO website](http://pjreddie.com/darknet/yolo/).
2. Convert the Darknet YOLO model to a Keras model.
3. Run YOLO detection.

```
wget https://pjreddie.com/media/files/yolov3.weights
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
python yolo_video.py [OPTIONS...] --image, for image detection mode, OR
python yolo_video.py [video_path] [output_path (optional)]
```

For Tiny YOLOv3, follow the same steps, but specify the model path and anchor path with `--model model_file` and `--anchors anchor_file`.

### Usage
Use `--help` to see the usage of yolo_video.py:
```
usage: yolo_video.py [-h] [--model MODEL] [--anchors ANCHORS]
                     [--classes CLASSES] [--gpu_num GPU_NUM] [--image]
                     [--input] [--output]

positional arguments:
  --input        Video input path
  --output       Video output path

optional arguments:
  -h, --help         show this help message and exit
  --model MODEL      path to model weight file, default model_data/yolo.h5
  --anchors ANCHORS  path to anchor definitions, default
                     model_data/yolo_anchors.txt
  --classes CLASSES  path to class definitions, default
                     model_data/coco_classes.txt
  --gpu_num GPU_NUM  Number of GPU to use, default 1
  --image            Image detection mode, will ignore all positional arguments
```
---

4. MultiGPU usage: use `--gpu_num N` to use N GPUs. The value is passed to the [Keras multi_gpu_model()](https://keras.io/utils/#multi_gpu_model).
|
||||||
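As a rough illustration of the options listed in the usage text above, the flags could be declared with `argparse` along the following lines. This is a hypothetical reconstruction from the help output only: the function name `build_parser`, the `nargs` settings, and the empty-string defaults for `--input` and `--output` are assumptions, and the actual yolo_video.py may be written differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the help text."""
    parser = argparse.ArgumentParser(prog="yolo_video.py")
    parser.add_argument(
        "--model", default="model_data/yolo.h5", help="path to model weight file"
    )
    parser.add_argument(
        "--anchors",
        default="model_data/yolo_anchors.txt",
        help="path to anchor definitions",
    )
    parser.add_argument(
        "--classes",
        default="model_data/coco_classes.txt",
        help="path to class definitions",
    )
    parser.add_argument("--gpu_num", type=int, default=1, help="number of GPUs to use")
    parser.add_argument(
        "--image", action="store_true", help="image detection mode"
    )
    # Assumed: video paths are optional and default to an empty string.
    parser.add_argument("--input", nargs="?", default="", help="video input path")
    parser.add_argument("--output", nargs="?", default="", help="video output path")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With a layout like this, omitted flags simply fall back to the defaults named in the help text.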

## Training

1. Generate your own annotation file and class names file.
    - One row per image.
    - Row format: `image_file_path box1 box2 ... boxN`.
    - Box format: `x_min,y_min,x_max,y_max,class_id` (no spaces).
    - For the VOC dataset, try `python voc_annotation.py`.
    - Here is an example:
    ```
    path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3
    path/to/img2.jpg 120,300,250,600,2
    ...
    ```

2. Make sure you have run `python convert.py -w yolov3.cfg yolov3.weights model_data/yolo_weights.h5`.
    The file model_data/yolo_weights.h5 is used to load pretrained weights.

3. Modify train.py and start training.
    - `python train.py`
    - Use your trained weights or checkpoint weights with the command line option `--model model_file` when running yolo_video.py.
    - Remember to set the class path and anchor path with `--classes class_file` and `--anchors anchor_file`.

If you want to use the original pretrained weights for YOLOv3:

1. `wget https://pjreddie.com/media/files/darknet53.conv.74`
2. Rename it to darknet53.weights.
3. `python convert.py -w darknet53.cfg darknet53.weights model_data/darknet53_weights.h5`
4. Use model_data/darknet53_weights.h5 in train.py.
|
||||||
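The annotation row format described in the Training steps can be read back with a few lines of Python. The sketch below is only an illustration: `parse_annotation_line` is a hypothetical helper, not a function from this repository, and it assumes that image paths contain no spaces (the row format separates fields with whitespace). It returns the image path together with a list of `(x_min, y_min, x_max, y_max, class_id)` tuples.

```python
def parse_annotation_line(line):
    """Split one annotation row into (image_path, list of box tuples)."""
    parts = line.split()
    image_path = parts[0]
    boxes = []
    for token in parts[1:]:
        # Each box is "x_min,y_min,x_max,y_max,class_id" with no spaces.
        x_min, y_min, x_max, y_max, class_id = (int(v) for v in token.split(","))
        boxes.append((x_min, y_min, x_max, y_max, class_id))
    return image_path, boxes


if __name__ == "__main__":
    row = "path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3"
    print(parse_annotation_line(row))
```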

---

## Some issues to know

1. The test environment is
    - Python 3.5.2
    - Keras 2.1.5
    - tensorflow 1.6.0

2. Default anchors are used. If you use your own anchors, some changes are probably needed.

3. The inference result is not exactly the same as Darknet's, but the difference is small.

4. The speed is slower than Darknet's. Replacing PIL with OpenCV may help a little.

5. Always load pretrained weights and freeze layers in the first stage of training, or try Darknet training instead. A mismatch warning is OK.

6. The training strategy is for reference only. Adjust it according to your dataset and your goal, and add further strategies if needed.

7. To speed up training with frozen layers, train_bottleneck.py can be used. It first computes the bottleneck features of the frozen model and then trains only the last layers. This makes training on a CPU possible in a reasonable time. See [this](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) for more information on bottleneck features.
|
||||||
@@ -0,0 +1,49 @@
import json
from collections import defaultdict

name_box_id = defaultdict(list)
id_name = dict()
with open(
    "mscoco2017/annotations/instances_train2017.json", encoding="utf-8"
) as f:
    data = json.load(f)

annotations = data["annotations"]
for ant in annotations:
    image_id = ant["image_id"]
    name = "mscoco2017/train2017/%012d.jpg" % image_id
    cat = ant["category_id"]

    # Map COCO's sparse category ids (1-90, with gaps) to contiguous 0-79 indices.
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11

    name_box_id[name].append([ant["bbox"], cat])

with open("train.txt", "w") as f:
    for key in name_box_id.keys():
        f.write(key)
        box_infos = name_box_id[key]
        for info in box_infos:
            # COCO boxes are [x, y, width, height]; convert to corner coordinates.
            x_min = int(info[0][0])
            y_min = int(info[0][1])
            x_max = x_min + int(info[0][2])
            y_max = y_min + int(info[0][3])

            box_info = " %d,%d,%d,%d,%d" % (x_min, y_min, x_max, y_max, int(info[1]))
            f.write(box_info)
        f.write("\n")
|
||||||
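The category remapping and box conversion in the script above can also be expressed as a small lookup table plus two helpers, which may be easier to check against the COCO id list. This is a sketch only: `COCO_ID_RANGES`, `remap_category`, and `coco_bbox_to_corners` are illustrative names that do not exist in the repository. One deliberate difference is that the sketch raises on an id outside the listed ranges, whereas the script leaves such an id unchanged.

```python
# Inclusive (first_id, last_id, offset) ranges, copied from the if/elif chain.
COCO_ID_RANGES = [
    (1, 11, 1),
    (13, 25, 2),
    (27, 28, 3),
    (31, 44, 5),
    (46, 65, 6),
    (67, 67, 7),
    (70, 70, 9),
    (72, 82, 10),
    (84, 90, 11),
]


def remap_category(cat):
    """Map a sparse COCO category id to a contiguous 0-based class index."""
    for first, last, offset in COCO_ID_RANGES:
        if first <= cat <= last:
            return cat - offset
    raise ValueError("category id {} is not in any known range".format(cat))


def coco_bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max).

    Each value is truncated with int() before adding, as in the script.
    """
    x, y, width, height = bbox
    x_min = int(x)
    y_min = int(y)
    return x_min, y_min, x_min + int(width), y_min + int(height)


if __name__ == "__main__":
    print(remap_category(90), coco_bbox_to_corners([10.5, 20.9, 30.2, 40.7]))
```

Enumerating every id covered by the table yields each index from 0 to 79 exactly once, which is a quick sanity check that no class is skipped or duplicated.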
@@ -0,0 +1,286 @@
#! /usr/bin/env python
"""
Reads Darknet config and weights and creates Keras model with TF backend.

"""

import argparse
import configparser
import io
import os
from collections import defaultdict

import numpy as np
from keras import backend as K
from keras.layers import (
    Conv2D,
    Input,
    ZeroPadding2D,
    Add,
    UpSampling2D,
    MaxPooling2D,
    Concatenate,
)
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.normalization import BatchNormalization
from keras.models import Model
from keras.regularizers import l2
from keras.utils.vis_utils import plot_model as plot


parser = argparse.ArgumentParser(description="Darknet To Keras Converter.")
parser.add_argument("config_path", help="Path to Darknet cfg file.")
parser.add_argument("weights_path", help="Path to Darknet weights file.")
parser.add_argument("output_path", help="Path to output Keras model file.")
parser.add_argument(
    "-p",
    "--plot_model",
    help="Plot generated Keras model and save as image.",
    action="store_true",
)
parser.add_argument(
    "-w",
    "--weights_only",
    help="Save as Keras weights file instead of model file.",
    action="store_true",
)


def unique_config_sections(config_file):
    """Convert all config sections to have unique names.

    Adds unique suffixes to config sections for compatibility with configparser.
    """
    section_counters = defaultdict(int)
    output_stream = io.StringIO()
    with open(config_file) as fin:
        for line in fin:
            if line.startswith("["):
                section = line.strip().strip("[]")
                _section = section + "_" + str(section_counters[section])
                section_counters[section] += 1
                line = line.replace(section, _section)
            output_stream.write(line)
    output_stream.seek(0)
    return output_stream


# %%
def _main(args):
    config_path = os.path.expanduser(args.config_path)
    weights_path = os.path.expanduser(args.weights_path)
    assert config_path.endswith(".cfg"), "{} is not a .cfg file".format(config_path)
    assert weights_path.endswith(".weights"), "{} is not a .weights file".format(
        weights_path
    )

    output_path = os.path.expanduser(args.output_path)
    assert output_path.endswith(".h5"), "output path {} is not a .h5 file".format(
        output_path
    )
    output_root = os.path.splitext(output_path)[0]

    # Load weights and config.
    print("Loading weights.")
    weights_file = open(weights_path, "rb")
    major, minor, revision = np.ndarray(
        shape=(3,), dtype="int32", buffer=weights_file.read(12)
    )
    if (major * 10 + minor) >= 2 and major < 1000 and minor < 1000:
        seen = np.ndarray(shape=(1,), dtype="int64", buffer=weights_file.read(8))
    else:
        seen = np.ndarray(shape=(1,), dtype="int32", buffer=weights_file.read(4))
    print("Weights Header: ", major, minor, revision, seen)

    print("Parsing Darknet config.")
    unique_config_file = unique_config_sections(config_path)
    cfg_parser = configparser.ConfigParser()
    cfg_parser.read_file(unique_config_file)

    print("Creating Keras model.")
    input_layer = Input(shape=(None, None, 3))
    prev_layer = input_layer
    all_layers = []

    weight_decay = (
        float(cfg_parser["net_0"]["decay"])
        if "net_0" in cfg_parser.sections()
        else 5e-4
    )
    count = 0
    out_index = []
    for section in cfg_parser.sections():
        print("Parsing section {}".format(section))
        if section.startswith("convolutional"):
            filters = int(cfg_parser[section]["filters"])
            size = int(cfg_parser[section]["size"])
            stride = int(cfg_parser[section]["stride"])
            pad = int(cfg_parser[section]["pad"])
            activation = cfg_parser[section]["activation"]
            batch_normalize = "batch_normalize" in cfg_parser[section]

            padding = "same" if pad == 1 and stride == 1 else "valid"

            # Setting weights.
            # Darknet serializes convolutional weights as:
            # [bias/beta, [gamma, mean, variance], conv_weights]
            prev_layer_shape = K.int_shape(prev_layer)

            weights_shape = (size, size, prev_layer_shape[-1], filters)
            darknet_w_shape = (filters, weights_shape[2], size, size)
            # np.prod is the supported spelling; np.product is a deprecated alias.
            weights_size = np.prod(weights_shape)

            print(
                "conv2d", "bn" if batch_normalize else " ", activation, weights_shape
            )

            conv_bias = np.ndarray(
                shape=(filters,), dtype="float32", buffer=weights_file.read(filters * 4)
            )
            count += filters

            if batch_normalize:
                bn_weights = np.ndarray(
                    shape=(3, filters),
                    dtype="float32",
                    buffer=weights_file.read(filters * 12),
                )
                count += 3 * filters

                bn_weight_list = [
                    bn_weights[0],  # scale gamma
                    conv_bias,  # shift beta
                    bn_weights[1],  # running mean
                    bn_weights[2],  # running var
                ]

            conv_weights = np.ndarray(
                shape=darknet_w_shape,
                dtype="float32",
                buffer=weights_file.read(weights_size * 4),
            )
            count += weights_size

            # DarkNet conv_weights are serialized Caffe-style:
            # (out_dim, in_dim, height, width)
            # We would like to set these to TensorFlow order:
            # (height, width, in_dim, out_dim)
            conv_weights = np.transpose(conv_weights, [2, 3, 1, 0])
            conv_weights = (
                [conv_weights] if batch_normalize else [conv_weights, conv_bias]
            )

            # Handle activation.
            act_fn = None
            if activation == "leaky":
                pass  # Add advanced activation later.
            elif activation != "linear":
                raise ValueError(
                    "Unknown activation function `{}` in section {}".format(
                        activation, section
                    )
                )

            # Create Conv2D layer
            if stride > 1:
                # Darknet uses left and top padding instead of 'same' mode
                prev_layer = ZeroPadding2D(((1, 0), (1, 0)))(prev_layer)
            conv_layer = (
                Conv2D(
                    filters,
                    (size, size),
                    strides=(stride, stride),
                    kernel_regularizer=l2(weight_decay),
                    use_bias=not batch_normalize,
                    weights=conv_weights,
                    activation=act_fn,
                    padding=padding,
                )
            )(prev_layer)

            if batch_normalize:
                conv_layer = (BatchNormalization(weights=bn_weight_list))(conv_layer)
            prev_layer = conv_layer

            if activation == "linear":
                all_layers.append(prev_layer)
            elif activation == "leaky":
                act_layer = LeakyReLU(alpha=0.1)(prev_layer)
                prev_layer = act_layer
                all_layers.append(act_layer)

        elif section.startswith("route"):
            ids = [int(i) for i in cfg_parser[section]["layers"].split(",")]
            layers = [all_layers[i] for i in ids]
            if len(layers) > 1:
                print("Concatenating route layers:", layers)
                concatenate_layer = Concatenate()(layers)
                all_layers.append(concatenate_layer)
                prev_layer = concatenate_layer
            else:
                skip_layer = layers[0]  # only one layer to route
                all_layers.append(skip_layer)
                prev_layer = skip_layer

        elif section.startswith("maxpool"):
            size = int(cfg_parser[section]["size"])
            stride = int(cfg_parser[section]["stride"])
            all_layers.append(
                MaxPooling2D(
                    pool_size=(size, size), strides=(stride, stride), padding="same"
                )(prev_layer)
            )
            prev_layer = all_layers[-1]

        elif section.startswith("shortcut"):
            index = int(cfg_parser[section]["from"])
            activation = cfg_parser[section]["activation"]
            assert activation == "linear", "Only linear activation supported."
            all_layers.append(Add()([all_layers[index], prev_layer]))
            prev_layer = all_layers[-1]

        elif section.startswith("upsample"):
            stride = int(cfg_parser[section]["stride"])
            assert stride == 2, "Only stride=2 supported."
            all_layers.append(UpSampling2D(stride)(prev_layer))
            prev_layer = all_layers[-1]

        elif section.startswith("yolo"):
            out_index.append(len(all_layers) - 1)
            all_layers.append(None)
            prev_layer = all_layers[-1]

        elif section.startswith("net"):
            pass

        else:
            raise ValueError("Unsupported section header type: {}".format(section))

    # Create and save model.
    if len(out_index) == 0:
        out_index.append(len(all_layers) - 1)
    model = Model(inputs=input_layer, outputs=[all_layers[i] for i in out_index])
    # summary() prints the table itself and returns None, so it is not wrapped in print().
    model.summary()
    if args.weights_only:
        model.save_weights("{}".format(output_path))
        print("Saved Keras weights to {}".format(output_path))
    else:
        model.save("{}".format(output_path))
        print("Saved Keras model to {}".format(output_path))

    # Check to see if all weights have been read.
    # Each weight is a 4-byte float32; integer division keeps the count an int.
    remaining_weights = len(weights_file.read()) // 4
    weights_file.close()
    print(
        "Read {} of {} from Darknet weights.".format(count, count + remaining_weights)
    )
    if remaining_weights > 0:
        print("Warning: {} unused weights".format(remaining_weights))

    if args.plot_model:
        plot(model, to_file="{}.png".format(output_root), show_shapes=True)
        print("Saved model plot to {}.png".format(output_root))


if __name__ == "__main__":
    _main(parser.parse_args())
|
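The reason `unique_config_sections` exists can be seen with a tiny config: Darknet cfg files repeat section names such as `[convolutional]`, and `configparser` in its default strict mode rejects duplicate sections. The sketch below works on an in-memory string instead of a file and is not the converter's own code: `CFG_TEXT`, `uniquify_sections`, and `parses_without_renaming` are hypothetical names used only for this illustration.

```python
import configparser
import io
from collections import defaultdict

# A miniature, made-up cfg with a repeated section name.
CFG_TEXT = """\
[net]
decay=0.0005

[convolutional]
filters=32
size=3

[convolutional]
filters=64
size=3

[shortcut]
from=-3
"""


def uniquify_sections(text):
    """Return a stream in which each [section] header gets a numeric suffix."""
    counters = defaultdict(int)
    out = io.StringIO()
    for line in text.splitlines(keepends=True):
        if line.startswith("["):
            name = line.strip().strip("[]")
            line = "[{}_{}]\n".format(name, counters[name])
            counters[name] += 1
        out.write(line)
    out.seek(0)
    return out


def parses_without_renaming(text):
    """Report whether configparser accepts the text as-is."""
    try:
        configparser.ConfigParser().read_string(text)
    except configparser.DuplicateSectionError:
        return False
    return True


cfg = configparser.ConfigParser()
cfg.read_file(uniquify_sections(CFG_TEXT))
sections = cfg.sections()

if __name__ == "__main__":
    print(parses_without_renaming(CFG_TEXT), sections)
```

Because sections keep their file order, the suffixed names double as a layer index, which is what lets the converter walk `cfg_parser.sections()` from top to bottom.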
||||||
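The header check at the start of `_main` reads three 32-bit integers (major, minor, revision) and then a `seen` counter whose width depends on the version. The same logic can be sketched with the standard `struct` module, which makes it easy to try on synthetic bytes. This is an illustration under stated assumptions: `read_darknet_header` is a hypothetical helper, and little-endian byte order is assumed here, whereas the converter uses NumPy's native byte order.

```python
import io
import struct


def read_darknet_header(stream):
    """Read (major, minor, revision, seen) from a Darknet weights stream."""
    major, minor, revision = struct.unpack("<3i", stream.read(12))
    # Newer files store `seen` as a 64-bit integer, older ones as 32-bit.
    if (major * 10 + minor) >= 2 and major < 1000 and minor < 1000:
        (seen,) = struct.unpack("<q", stream.read(8))
    else:
        (seen,) = struct.unpack("<i", stream.read(4))
    return major, minor, revision, seen


if __name__ == "__main__":
    # Synthetic header: version 0.2.0 with a 64-bit `seen` value.
    fake = io.BytesIO(struct.pack("<3iq", 0, 2, 0, 32013312))
    print(read_darknet_header(fake))
```

After the header, the stream position sits at the first layer's bias values, which is where the converter's per-layer reads begin.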
@@ -0,0 +1,548 @@
[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width=640
height=640
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear
|
||||||
|
|
||||||
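The `[net]` block of the config above sets `learning_rate=0.001`, `burn_in=1000`, `policy=steps`, `steps=400000,450000` and `scales=.1,.1`. As a rough sketch of what those values imply, the function below computes the learning rate at a given batch number. It rests on two assumptions that are not stated in the cfg itself: that the `steps` policy multiplies the rate by each scale once the batch number reaches the matching step, and that the burn-in ramp uses an exponent of 4 (`power`), which is believed to be Darknet's default. `learning_rate_at` is a hypothetical helper, not part of Darknet or of this repository.

```python
def learning_rate_at(
    batch,
    base_lr=0.001,
    burn_in=1000,
    steps=(400000, 450000),
    scales=(0.1, 0.1),
    power=4.0,  # assumed burn-in exponent
):
    """Approximate the step-policy learning rate for a given batch number."""
    if batch < burn_in:
        # Ramp up from zero during the burn-in phase.
        return base_lr * (batch / burn_in) ** power
    lr = base_lr
    for step, scale in zip(steps, scales):
        if batch >= step:
            lr *= scale
    return lr


if __name__ == "__main__":
    for b in (500, 1000, 400000, 450000):
        print(b, learning_rate_at(b))
```

These settings only matter when training with Darknet itself; the converter in this commit reads `decay` from `[net]` and ignores the rest of the schedule.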
@@ -0,0 +1,45 @@
Copyright (c) 2014, Mozilla Foundation https://mozilla.org/ with Reserved Font Name Fira Mono.

Copyright (c) 2014, Telefonica S.A.

This Font Software is licensed under the SIL Open Font License, Version 1.1.
This license is copied below, and is also available with a FAQ at: http://scripts.sil.org/OFL

-----------------------------------------------------------
SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
-----------------------------------------------------------

PREAMBLE
The goals of the Open Font License (OFL) are to stimulate worldwide development of collaborative font projects, to support the font creation efforts of academic and linguistic communities, and to provide a free and open framework in which fonts may be shared and improved in partnership with others.

The OFL allows the licensed fonts to be used, studied, modified and redistributed freely as long as they are not sold by themselves. The fonts, including any derivative works, can be bundled, embedded, redistributed and/or sold with any software provided that any reserved names are not used by derivative works. The fonts and derivatives, however, cannot be released under any other type of license. The requirement for fonts to remain under this license does not apply to any document created using the fonts or their derivatives.

DEFINITIONS
"Font Software" refers to the set of files released by the Copyright Holder(s) under this license and clearly marked as such. This may include source files, build scripts and documentation.

"Reserved Font Name" refers to any names specified as such after the copyright statement(s).

"Original Version" refers to the collection of Font Software components as distributed by the Copyright Holder(s).

"Modified Version" refers to any derivative made by adding to, deleting, or substituting -- in part or in whole -- any of the components of the Original Version, by changing formats or by porting the Font Software to a new environment.

"Author" refers to any designer, engineer, programmer, technical writer or other person who contributed to the Font Software.

PERMISSION & CONDITIONS
Permission is hereby granted, free of charge, to any person obtaining a copy of the Font Software, to use, study, copy, merge, embed, modify, redistribute, and sell modified and unmodified copies of the Font Software, subject to the following conditions:

1) Neither the Font Software nor any of its individual components, in Original or Modified Versions, may be sold by itself.

2) Original or Modified Versions of the Font Software may be bundled, redistributed and/or sold with any software, provided that each copy contains the above copyright notice and this license. These can be included either as stand-alone text files, human-readable headers or in the appropriate machine-readable metadata fields within text or binary files as long as those fields can be easily viewed by the user.

3) No Modified Version of the Font Software may use the Reserved Font Name(s) unless explicit written permission is granted by the corresponding Copyright Holder. This restriction only applies to the primary font name as presented to the users.

4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font Software shall not be used to promote, endorse or advertise any Modified Version, except to acknowledge the contribution(s) of the Copyright Holder(s) and the Author(s) or with their explicit written permission.

5) The Font Software, modified or unmodified, in part or in whole, must be distributed entirely under this license, and must not be distributed under any other license. The requirement for fonts to remain under this license does not apply to any document created using the Font Software.

TERMINATION
This license becomes null and void if any of the above conditions are not met.

DISCLAIMER
THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE FONT SOFTWARE.
|
||||||
@@ -0,0 +1,99 @@
import numpy as np


class YOLO_Kmeans:
    def __init__(self, cluster_number, filename):
        self.cluster_number = cluster_number
        self.filename = filename  # use the path passed in, not a hard-coded name

    def iou(self, boxes, clusters):  # 1 box -> k clusters
        n = boxes.shape[0]
        k = self.cluster_number

        box_area = boxes[:, 0] * boxes[:, 1]
        box_area = box_area.repeat(k)
        box_area = np.reshape(box_area, (n, k))

        cluster_area = clusters[:, 0] * clusters[:, 1]
        cluster_area = np.tile(cluster_area, [1, n])
        cluster_area = np.reshape(cluster_area, (n, k))

        box_w_matrix = np.reshape(boxes[:, 0].repeat(k), (n, k))
        cluster_w_matrix = np.reshape(np.tile(clusters[:, 0], (1, n)), (n, k))
        min_w_matrix = np.minimum(cluster_w_matrix, box_w_matrix)

        box_h_matrix = np.reshape(boxes[:, 1].repeat(k), (n, k))
        cluster_h_matrix = np.reshape(np.tile(clusters[:, 1], (1, n)), (n, k))
        min_h_matrix = np.minimum(cluster_h_matrix, box_h_matrix)
        inter_area = np.multiply(min_w_matrix, min_h_matrix)

        result = inter_area / (box_area + cluster_area - inter_area)
        return result

    def avg_iou(self, boxes, clusters):
        accuracy = np.mean([np.max(self.iou(boxes, clusters), axis=1)])
        return accuracy

    def kmeans(self, boxes, k, dist=np.median):
        box_number = boxes.shape[0]
        distances = np.empty((box_number, k))
        last_nearest = np.zeros((box_number,))
        np.random.seed()
        clusters = boxes[
            np.random.choice(box_number, k, replace=False)
        ]  # init k clusters
        while True:

            distances = 1 - self.iou(boxes, clusters)

            current_nearest = np.argmin(distances, axis=1)
            if (last_nearest == current_nearest).all():
                break  # clusters won't change
            for cluster in range(k):
                clusters[cluster] = dist(  # update clusters
                    boxes[current_nearest == cluster], axis=0
                )

            last_nearest = current_nearest

        return clusters

    def result2txt(self, data):
        f = open("yolo_anchors.txt", "w")
        row = np.shape(data)[0]
        for i in range(row):
            if i == 0:
                x_y = "%d,%d" % (data[i][0], data[i][1])
            else:
                x_y = ", %d,%d" % (data[i][0], data[i][1])
            f.write(x_y)
        f.close()

    def txt2boxes(self):
        f = open(self.filename, "r")
        dataSet = []
        for line in f:
            infos = line.split(" ")
            length = len(infos)
            for i in range(1, length):
                width = int(infos[i].split(",")[2]) - int(infos[i].split(",")[0])
                height = int(infos[i].split(",")[3]) - int(infos[i].split(",")[1])
                dataSet.append([width, height])
        result = np.array(dataSet)
        f.close()
        return result

    def txt2clusters(self):
        all_boxes = self.txt2boxes()
        result = self.kmeans(all_boxes, k=self.cluster_number)
        result = result[np.lexsort(result.T[0, None])]
        self.result2txt(result)
        print("K anchors:\n {}".format(result))
        print("Accuracy: {:.2f}%".format(self.avg_iou(all_boxes, result) * 100))


if __name__ == "__main__":
    cluster_number = 9
    filename = "2012_train.txt"
    kmeans = YOLO_Kmeans(cluster_number, filename)
    kmeans.txt2clusters()
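The `iou` method above compares boxes and cluster centres by width and height only, as if every box shared one corner with every cluster, and `kmeans` uses `1 - IoU` as its distance. The following is a minimal pure-Python sketch of that distance for a single pair. The helper names `wh_iou` and `nearest_anchor` are hypothetical and do not appear in the commit, and the three anchors are just sample values.

```python
def wh_iou(box, anchor):
    """IoU of two (width, height) pairs aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def nearest_anchor(box, anchors):
    """Index of the anchor with the smallest 1 - IoU distance to box."""
    return min(range(len(anchors)), key=lambda j: 1 - wh_iou(box, anchors[j]))


# Sample (width, height) anchors, chosen only for illustration.
anchors = [(10, 13), (62, 45), (373, 326)]
print(wh_iou((10, 13), (10, 13)))
print(nearest_anchor((60, 50), anchors))
```

The NumPy version in the file computes the same quantity for all `n` boxes against all `k` clusters at once by tiling the widths and heights into `(n, k)` matrices.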
@@ -0,0 +1,80 @@
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
@@ -0,0 +1,20 @@
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
@@ -0,0 +1 @@
10,14, 23,27, 37,58, 81,82, 135,169, 344,319
@@ -0,0 +1 @@
10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
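Both anchor files in this commit hold a single line of comma-separated numbers that are read as consecutive (width, height) pairs: six pairs in the first file and nine in the second. A minimal sketch of that parsing in plain Python follows. `parse_anchors` is a hypothetical name, and the even-count check is an added assumption rather than something the training scripts do.

```python
def parse_anchors(line):
    """Turn 'w,h, w,h, ...' into a list of (w, h) float pairs."""
    # float() tolerates the space that follows every second comma.
    values = [float(x) for x in line.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values")
    return list(zip(values[0::2], values[1::2]))


tiny = parse_anchors("10,14, 23,27, 37,58, 81,82, 135,169, 344,319")
full = parse_anchors(
    "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326"
)
print(len(tiny), len(full))
```

The number of pairs matters downstream: the training script in this commit treats a six-anchor file as the tiny model (`len(anchors) == 6`).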
@@ -0,0 +1,316 @@
"""
Retrain the YOLO model for your own dataset.
"""

import numpy as np
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model
from keras.optimizers import Adam
from keras.callbacks import (
    TensorBoard,
    ModelCheckpoint,
    ReduceLROnPlateau,
    EarlyStopping,
)

from yolo3.model import preprocess_true_boxes, yolo_body, tiny_yolo_body, yolo_loss
from yolo3.utils import get_random_data


def _main():
    annotation_path = "data_train.txt"
    log_dir = "logs/003/"
    classes_path = "data_classes.txt"
    # anchors_path = 'model_data/yolo-tiny_anchors.txt'
    anchors_path = "model_data/yolo_anchors.txt"
    class_names = get_classes(classes_path)
    num_classes = len(class_names)
    anchors = get_anchors(anchors_path)

    input_shape = (416, 416)  # multiple of 32, hw
    epoch1, epoch2 = 40, 40

    is_tiny_version = len(anchors) == 6  # default setting
    if is_tiny_version:
        model = create_tiny_model(
            input_shape,
            anchors,
            num_classes,
            freeze_body=2,
            weights_path="model_data/yolo-tiny.h5",
        )
    else:
        model = create_model(
            input_shape,
            anchors,
            num_classes,
            freeze_body=2,
            weights_path="model_data/yolo.h5",
        )  # make sure you know what you freeze

    logging = TensorBoard(log_dir=log_dir)
    # checkpoint = ModelCheckpoint(log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
    #     monitor='val_loss', save_weights_only=True, save_best_only=True, period=3)
    checkpoint = ModelCheckpoint(
        log_dir + "checkpoint.h5",
        monitor="val_loss",
        save_weights_only=True,
        save_best_only=True,
        period=5,
    )
    reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=3, verbose=1)
    early_stopping = EarlyStopping(
        monitor="val_loss", min_delta=0, patience=10, verbose=1
    )

    val_split = 0.1
    with open(annotation_path) as f:
        lines = f.readlines()
    np.random.seed(10101)
    np.random.shuffle(lines)
    np.random.seed(None)
    num_val = int(len(lines) * val_split)
    num_train = len(lines) - num_val

    # Train with frozen layers first, to get a stable loss.
    # Adjust the number of epochs to your dataset. This step alone is often enough to obtain a decent model.
    if True:
        model.compile(
            optimizer=Adam(lr=1e-3),
            loss={
                # use custom yolo_loss Lambda layer.
                "yolo_loss": lambda y_true, y_pred: y_pred
            },
        )

        batch_size = 32
        print(
            "Train on {} samples, val on {} samples, with batch size {}.".format(
                num_train, num_val, batch_size
            )
        )
        model.fit_generator(
            data_generator_wrapper(
                lines[:num_train], batch_size, input_shape, anchors, num_classes
            ),
            steps_per_epoch=max(1, num_train // batch_size),
            validation_data=data_generator_wrapper(
                lines[num_train:], batch_size, input_shape, anchors, num_classes
            ),
            validation_steps=max(1, num_val // batch_size),
            epochs=epoch1,
            initial_epoch=0,
            callbacks=[logging, checkpoint],
        )
        model.save_weights(log_dir + "trained_weights_stage_1.h5")

    # Unfreeze and continue training, to fine-tune.
    # Train longer if the result is not good.
    if True:
        for i in range(len(model.layers)):
            model.layers[i].trainable = True
        model.compile(
            optimizer=Adam(lr=1e-4), loss={"yolo_loss": lambda y_true, y_pred: y_pred}
        )  # recompile to apply the change
        print("Unfreeze all of the layers.")

        batch_size = (
            16  # note that more GPU memory is required after unfreezing the body
        )
        print(
            "Train on {} samples, val on {} samples, with batch size {}.".format(
                num_train, num_val, batch_size
            )
        )
        model.fit_generator(
            data_generator_wrapper(
                lines[:num_train], batch_size, input_shape, anchors, num_classes
            ),
            steps_per_epoch=max(1, num_train // batch_size),
            validation_data=data_generator_wrapper(
                lines[num_train:], batch_size, input_shape, anchors, num_classes
            ),
            validation_steps=max(1, num_val // batch_size),
            epochs=epoch1 + epoch2,
            initial_epoch=epoch1,
            callbacks=[logging, checkpoint, reduce_lr, early_stopping],
        )
        model.save_weights(log_dir + "trained_weights_final.h5")

    # Further training if needed.


def get_classes(classes_path):
    """loads the classes"""
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names


def get_anchors(anchors_path):
    """loads the anchors from a file"""
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(",")]
    return np.array(anchors).reshape(-1, 2)


def create_model(
    input_shape,
    anchors,
    num_classes,
    load_pretrained=True,
    freeze_body=2,
    weights_path="model_data/yolo_weights.h5",
):
    """create the training model"""
    K.clear_session()  # get a new session
    image_input = Input(shape=(None, None, 3))
    h, w = input_shape
    num_anchors = len(anchors)

    y_true = [
        Input(
            shape=(
                h // {0: 32, 1: 16, 2: 8}[l],
                w // {0: 32, 1: 16, 2: 8}[l],
                num_anchors // 3,
                num_classes + 5,
            )
        )
        for l in range(3)
    ]

    model_body = yolo_body(image_input, num_anchors // 3, num_classes)
    print(
        "Create YOLOv3 model with {} anchors and {} classes.".format(
            num_anchors, num_classes
        )
    )

    if load_pretrained:
        model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
        print("Load weights {}.".format(weights_path))
        if freeze_body in [1, 2]:
            # Freeze darknet53 body or freeze all but 3 output layers.
            num = (185, len(model_body.layers) - 3)[freeze_body - 1]
            for i in range(num):
                model_body.layers[i].trainable = False
            print(
                "Freeze the first {} layers of total {} layers.".format(
                    num, len(model_body.layers)
                )
            )

    model_loss = Lambda(
        yolo_loss,
        output_shape=(1,),
        name="yolo_loss",
        arguments={
            "anchors": anchors,
            "num_classes": num_classes,
            "ignore_thresh": 0.5,
        },
    )([*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)

    return model


def create_tiny_model(
    input_shape,
    anchors,
    num_classes,
    load_pretrained=True,
    freeze_body=2,
    weights_path="model_data/tiny_yolo_weights.h5",
):
    """create the training model, for Tiny YOLOv3"""
    K.clear_session()  # get a new session
    image_input = Input(shape=(None, None, 3))
    h, w = input_shape
    num_anchors = len(anchors)

    y_true = [
        Input(
            shape=(
                h // {0: 32, 1: 16}[l],
                w // {0: 32, 1: 16}[l],
                num_anchors // 2,
                num_classes + 5,
            )
        )
        for l in range(2)
    ]

    model_body = tiny_yolo_body(image_input, num_anchors // 2, num_classes)
    print(
        "Create Tiny YOLOv3 model with {} anchors and {} classes.".format(
            num_anchors, num_classes
        )
    )

    if load_pretrained:
        model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
        print("Load weights {}.".format(weights_path))
        if freeze_body in [1, 2]:
            # Freeze the darknet body or freeze all but 2 output layers.
            num = (20, len(model_body.layers) - 2)[freeze_body - 1]
            for i in range(num):
                model_body.layers[i].trainable = False
            print(
                "Freeze the first {} layers of total {} layers.".format(
                    num, len(model_body.layers)
                )
            )

    model_loss = Lambda(
        yolo_loss,
        output_shape=(1,),
        name="yolo_loss",
        arguments={
            "anchors": anchors,
            "num_classes": num_classes,
            "ignore_thresh": 0.7,
        },
    )([*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)

    return model


def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes):
    """data generator for fit_generator"""
    n = len(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            if i == 0:
                np.random.shuffle(annotation_lines)
            image, box = get_random_data(annotation_lines[i], input_shape, random=True)
            image_data.append(image)
            box_data.append(box)
            i = (i + 1) % n
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)


def data_generator_wrapper(
    annotation_lines, batch_size, input_shape, anchors, num_classes
):
    n = len(annotation_lines)
    if n == 0 or batch_size <= 0:
        return None
    return data_generator(
        annotation_lines, batch_size, input_shape, anchors, num_classes
    )


if __name__ == "__main__":
    _main()
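The training script above shuffles the annotation lines with a fixed seed and then holds out the last 10% as a validation set, so the split is the same on every run. The following is a minimal sketch of that split. It swaps NumPy's global RNG for the standard library's `random.Random`, so it will not reproduce the script's exact ordering, and `split_lines` and the sample annotation strings are hypothetical.

```python
import random


def split_lines(lines, val_split=0.1, seed=10101):
    """Shuffle a copy of lines reproducibly, then split off a validation tail."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    num_val = int(len(shuffled) * val_split)
    num_train = len(shuffled) - num_val
    return shuffled[:num_train], shuffled[num_train:]


# Made-up annotation lines in the "path x1,y1,x2,y2,class" layout.
lines = ["img%03d.jpg 0,0,10,10,0" % i for i in range(25)]
train, val = split_lines(lines)
print(len(train), len(val))
```

Because `int()` truncates, small datasets get a smaller validation set than the nominal fraction suggests, which is why the script guards its step counts with `max(1, ...)`.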
@@ -0,0 +1,404 @@
|
|||||||
|
"""
|
||||||
|
Retrain the YOLO model for your own dataset.
|
||||||
|
"""
|
||||||
|
import os
|
||||||
|
import numpy as np
|
||||||
|
import keras.backend as K
|
||||||
|
from keras.layers import Input, Lambda
|
||||||
|
from keras.models import Model
|
||||||
|
from keras.optimizers import Adam
|
||||||
|
from keras.callbacks import (
|
||||||
|
TensorBoard,
|
||||||
|
ModelCheckpoint,
|
||||||
|
ReduceLROnPlateau,
|
||||||
|
EarlyStopping,
|
||||||
|
)
|
||||||
|
|
||||||
|
from yolo3.model import preprocess_true_boxes, yolo_body, tiny_yolo_body, yolo_loss
|
||||||
|
from yolo3.utils import get_random_data
|
||||||
|
|
||||||
|
|
||||||
|
def _main():
|
||||||
|
annotation_path = "train.txt"
|
||||||
|
log_dir = "logs/000/"
|
||||||
|
classes_path = "model_data/coco_classes.txt"
|
||||||
|
anchors_path = "model_data/yolo_anchors.txt"
|
||||||
|
class_names = get_classes(classes_path)
|
||||||
|
num_classes = len(class_names)
|
||||||
|
anchors = get_anchors(anchors_path)
|
||||||
|
|
||||||
|
input_shape = (640, 640) # multiple of 32, hw
|
||||||
|
|
||||||
|
model, bottleneck_model, last_layer_model = create_model(
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
freeze_body=2,
|
||||||
|
weights_path="model_data/yolo_weights.h5",
|
||||||
|
) # make sure you know what you freeze
|
||||||
|
|
||||||
|
logging = TensorBoard(log_dir=log_dir)
|
||||||
|
checkpoint = ModelCheckpoint(
|
||||||
|
log_dir + "ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5",
|
||||||
|
monitor="val_loss",
|
||||||
|
save_weights_only=True,
|
||||||
|
save_best_only=True,
|
||||||
|
period=3,
|
||||||
|
)
|
||||||
|
reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=3, verbose=1)
|
||||||
|
early_stopping = EarlyStopping(
|
||||||
|
monitor="val_loss", min_delta=0, patience=10, verbose=1
|
||||||
|
)
|
||||||
|
|
||||||
|
val_split = 0.1
|
||||||
|
with open(annotation_path) as f:
|
||||||
|
lines = f.readlines()
|
||||||
|
np.random.seed(10101)
|
||||||
|
np.random.shuffle(lines)
|
||||||
|
np.random.seed(None)
|
||||||
|
num_val = int(len(lines) * val_split)
|
||||||
|
num_train = len(lines) - num_val
|
||||||
|
|
||||||
|
# Train with frozen layers first, to get a stable loss.
|
||||||
|
# Adjust num epochs to your dataset. This step is enough to obtain a not bad model.
|
||||||
|
if True:
|
||||||
|
# perform bottleneck training
|
||||||
|
if not os.path.isfile("bottlenecks.npz"):
|
||||||
|
print("calculating bottlenecks")
|
||||||
|
batch_size = 8
|
||||||
|
bottlenecks = bottleneck_model.predict_generator(
|
||||||
|
data_generator_wrapper(
|
||||||
|
lines,
|
||||||
|
batch_size,
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
random=False,
|
||||||
|
verbose=True,
|
||||||
|
),
|
||||||
|
steps=(len(lines) // batch_size) + 1,
|
||||||
|
max_queue_size=1,
|
||||||
|
)
|
||||||
|
np.savez(
|
||||||
|
"bottlenecks.npz",
|
||||||
|
bot0=bottlenecks[0],
|
||||||
|
bot1=bottlenecks[1],
|
||||||
|
bot2=bottlenecks[2],
|
||||||
|
)
|
||||||
|
|
||||||
|
# load bottleneck features from file
|
||||||
|
dict_bot = np.load("bottlenecks.npz")
|
||||||
|
bottlenecks_train = [
|
||||||
|
dict_bot["bot0"][:num_train],
|
||||||
|
dict_bot["bot1"][:num_train],
|
||||||
|
dict_bot["bot2"][:num_train],
|
||||||
|
]
|
||||||
|
bottlenecks_val = [
|
||||||
|
dict_bot["bot0"][num_train:],
|
||||||
|
dict_bot["bot1"][num_train:],
|
||||||
|
dict_bot["bot2"][num_train:],
|
||||||
|
]
|
||||||
|
|
||||||
|
# train last layers with fixed bottleneck features
|
||||||
|
batch_size = 8
|
||||||
|
print("Training last layers with bottleneck features")
|
||||||
|
print(
|
||||||
|
"with {} samples, val on {} samples and batch size {}.".format(
|
||||||
|
num_train, num_val, batch_size
|
||||||
|
)
|
||||||
|
)
|
||||||
|
last_layer_model.compile(
|
||||||
|
optimizer="adam", loss={"yolo_loss": lambda y_true, y_pred: y_pred}
|
||||||
|
)
|
||||||
|
last_layer_model.fit_generator(
|
||||||
|
bottleneck_generator(
|
||||||
|
lines[:num_train],
|
||||||
|
batch_size,
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
bottlenecks_train,
|
||||||
|
),
|
||||||
|
steps_per_epoch=max(1, num_train // batch_size),
|
||||||
|
validation_data=bottleneck_generator(
|
||||||
|
lines[num_train:],
|
||||||
|
batch_size,
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
bottlenecks_val,
|
||||||
|
),
|
||||||
|
validation_steps=max(1, num_val // batch_size),
|
||||||
|
epochs=30,
|
||||||
|
initial_epoch=0,
|
||||||
|
max_queue_size=1,
|
||||||
|
)
|
||||||
|
model.save_weights(log_dir + "trained_weights_stage_0.h5")
|
||||||
|
|
||||||
|
# train last layers with random augmented data
|
||||||
|
model.compile(
|
||||||
|
optimizer=Adam(lr=1e-3),
|
||||||
|
loss={
|
||||||
|
# use custom yolo_loss Lambda layer.
|
||||||
|
"yolo_loss": lambda y_true, y_pred: y_pred
|
||||||
|
},
|
||||||
|
)
|
||||||
|
batch_size = 16
|
||||||
|
print(
|
||||||
|
"Train on {} samples, val on {} samples, with batch size {}.".format(
|
||||||
|
num_train, num_val, batch_size
|
||||||
|
)
|
||||||
|
)
|
||||||
|
model.fit_generator(
|
||||||
|
data_generator_wrapper(
|
||||||
|
lines[:num_train], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
steps_per_epoch=max(1, num_train // batch_size),
|
||||||
|
validation_data=data_generator_wrapper(
|
||||||
|
lines[num_train:], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
validation_steps=max(1, num_val // batch_size),
|
||||||
|
epochs=50,
|
||||||
|
initial_epoch=0,
|
||||||
|
callbacks=[logging, checkpoint],
|
||||||
|
)
|
||||||
|
model.save_weights(log_dir + "trained_weights_stage_1.h5")
|
||||||
|
|
||||||
|
# Unfreeze and continue training, to fine-tune.
|
||||||
|
# Train longer if the result is not good.
|
||||||
|
if True:
|
||||||
|
for i in range(len(model.layers)):
|
||||||
|
model.layers[i].trainable = True
|
||||||
|
model.compile(
|
||||||
|
optimizer=Adam(lr=1e-4), loss={"yolo_loss": lambda y_true, y_pred: y_pred}
|
||||||
|
) # recompile to apply the change
|
||||||
|
print("Unfreeze all of the layers.")
|
||||||
|
|
||||||
|
batch_size = (
|
||||||
|
4 # note that more GPU memory is required after unfreezing the body
|
||||||
|
)
|
||||||
|
print(
|
||||||
|
"Train on {} samples, val on {} samples, with batch size {}.".format(
|
||||||
|
num_train, num_val, batch_size
|
||||||
|
)
|
||||||
|
)
|
||||||
|
model.fit_generator(
|
||||||
|
data_generator_wrapper(
|
||||||
|
lines[:num_train], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
steps_per_epoch=max(1, num_train // batch_size),
|
||||||
|
validation_data=data_generator_wrapper(
|
||||||
|
lines[num_train:], batch_size, input_shape, anchors, num_classes
|
||||||
|
),
|
||||||
|
validation_steps=max(1, num_val // batch_size),
|
||||||
|
epochs=100,
|
||||||
|
initial_epoch=50,
|
||||||
|
callbacks=[logging, checkpoint, reduce_lr, early_stopping],
|
||||||
|
)
|
||||||
|
model.save_weights(log_dir + "trained_weights_final.h5")
|
||||||
|
|
||||||
|
# Further training if needed.
|
||||||
|
|
||||||
|
|
||||||
|
def get_classes(classes_path):
|
||||||
|
"""loads the classes"""
|
||||||
|
with open(classes_path) as f:
|
||||||
|
class_names = f.readlines()
|
||||||
|
class_names = [c.strip() for c in class_names]
|
||||||
|
return class_names
|
||||||
|
|
||||||
|
|
||||||
|
def get_anchors(anchors_path):
|
||||||
|
"""loads the anchors from a file"""
|
||||||
|
with open(anchors_path) as f:
|
||||||
|
anchors = f.readline()
|
||||||
|
anchors = [float(x) for x in anchors.split(",")]
|
||||||
|
return np.array(anchors).reshape(-1, 2)
|
||||||
|
|
||||||
|
|
||||||
|
def create_model(
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
load_pretrained=True,
|
||||||
|
freeze_body=2,
|
||||||
|
weights_path="model_data/yolo_weights.h5",
|
||||||
|
):
|
||||||
|
"""create the training model"""
|
||||||
|
K.clear_session() # get a new session
|
||||||
|
image_input = Input(shape=(None, None, 3))
|
||||||
|
h, w = input_shape
|
||||||
|
num_anchors = len(anchors)
|
||||||
|
|
||||||
|
y_true = [
|
||||||
|
Input(
|
||||||
|
shape=(
|
||||||
|
h // {0: 32, 1: 16, 2: 8}[l],
|
||||||
|
w // {0: 32, 1: 16, 2: 8}[l],
|
||||||
|
num_anchors // 3,
|
||||||
|
num_classes + 5,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
for l in range(3)
|
||||||
|
]
|
||||||
|
|
||||||
|
model_body = yolo_body(image_input, num_anchors // 3, num_classes)
|
||||||
|
print(
|
||||||
|
"Create YOLOv3 model with {} anchors and {} classes.".format(
|
||||||
|
num_anchors, num_classes
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
if load_pretrained:
|
||||||
|
model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
|
||||||
|
print("Load weights {}.".format(weights_path))
|
||||||
|
if freeze_body in [1, 2]:
|
||||||
|
# Freeze darknet53 body or freeze all but 3 output layers.
|
||||||
|
num = (185, len(model_body.layers) - 3)[freeze_body - 1]
|
||||||
|
for i in range(num):
|
||||||
|
model_body.layers[i].trainable = False
|
||||||
|
print(
|
||||||
|
"Freeze the first {} layers of total {} layers.".format(
|
||||||
|
num, len(model_body.layers)
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
# get output of second last layers and create bottleneck model of it
|
||||||
|
out1 = model_body.layers[246].output
|
||||||
|
out2 = model_body.layers[247].output
|
||||||
|
out3 = model_body.layers[248].output
|
||||||
|
bottleneck_model = Model([model_body.input, *y_true], [out1, out2, out3])
|
||||||
|
|
||||||
|
# create last layer model of last layers from yolo model
|
||||||
|
in0 = Input(shape=bottleneck_model.output[0].shape[1:].as_list())
|
||||||
|
in1 = Input(shape=bottleneck_model.output[1].shape[1:].as_list())
|
||||||
|
in2 = Input(shape=bottleneck_model.output[2].shape[1:].as_list())
|
||||||
|
last_out0 = model_body.layers[249](in0)
|
||||||
|
last_out1 = model_body.layers[250](in1)
|
||||||
|
last_out2 = model_body.layers[251](in2)
|
||||||
|
model_last = Model(
|
||||||
|
inputs=[in0, in1, in2], outputs=[last_out0, last_out1, last_out2]
|
||||||
|
)
|
||||||
|
model_loss_last = Lambda(
|
||||||
|
yolo_loss,
|
||||||
|
output_shape=(1,),
|
||||||
|
name="yolo_loss",
|
||||||
|
arguments={
|
||||||
|
"anchors": anchors,
|
||||||
|
"num_classes": num_classes,
|
||||||
|
"ignore_thresh": 0.5,
|
||||||
|
},
|
||||||
|
)([*model_last.output, *y_true])
|
||||||
|
last_layer_model = Model([in0, in1, in2, *y_true], model_loss_last)
|
||||||
|
|
||||||
|
model_loss = Lambda(
|
||||||
|
yolo_loss,
|
||||||
|
output_shape=(1,),
|
||||||
|
name="yolo_loss",
|
||||||
|
arguments={
|
||||||
|
"anchors": anchors,
|
||||||
|
"num_classes": num_classes,
|
||||||
|
"ignore_thresh": 0.5,
|
||||||
|
},
|
||||||
|
)([*model_body.output, *y_true])
|
||||||
|
model = Model([model_body.input, *y_true], model_loss)
|
||||||
|
|
||||||
|
return model, bottleneck_model, last_layer_model
|
||||||
|
|
||||||
|
|
||||||
|
def data_generator(
|
||||||
|
annotation_lines,
|
||||||
|
batch_size,
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
random=True,
|
||||||
|
verbose=False,
|
||||||
|
):
|
||||||
|
"""data generator for fit_generator"""
|
||||||
|
n = len(annotation_lines)
|
||||||
|
i = 0
|
||||||
|
while True:
|
||||||
|
image_data = []
|
||||||
|
box_data = []
|
||||||
|
for b in range(batch_size):
|
||||||
|
if i == 0 and random:
|
||||||
|
np.random.shuffle(annotation_lines)
|
||||||
|
image, box = get_random_data(
|
||||||
|
annotation_lines[i], input_shape, random=random
|
||||||
|
)
|
||||||
|
image_data.append(image)
|
||||||
|
box_data.append(box)
|
||||||
|
i = (i + 1) % n
|
||||||
|
image_data = np.array(image_data)
|
||||||
|
if verbose:
|
||||||
|
print("Progress: ", i, "/", n)
|
||||||
|
box_data = np.array(box_data)
|
||||||
|
y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
|
||||||
|
yield [image_data, *y_true], np.zeros(batch_size)
|
||||||
|
|
||||||
|
|
||||||
|
def data_generator_wrapper(
|
||||||
|
annotation_lines,
|
||||||
|
batch_size,
|
||||||
|
input_shape,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
random=True,
|
||||||
|
verbose=False,
|
||||||
|
):
|
||||||
|
n = len(annotation_lines)
|
||||||
|
if n == 0 or batch_size <= 0:
|
||||||
|
return None
|
||||||
|
return data_generator(
|
||||||
|
annotation_lines, batch_size, input_shape, anchors, num_classes, random, verbose
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def bottleneck_generator(
|
||||||
|
annotation_lines, batch_size, input_shape, anchors, num_classes, bottlenecks
|
||||||
|
):
|
||||||
|
n = len(annotation_lines)
|
||||||
|
i = 0
|
||||||
|
while True:
|
||||||
|
box_data = []
|
||||||
|
b0 = np.zeros(
|
||||||
|
(
|
||||||
|
batch_size,
|
||||||
|
bottlenecks[0].shape[1],
|
||||||
|
bottlenecks[0].shape[2],
|
||||||
|
bottlenecks[0].shape[3],
|
||||||
|
)
|
||||||
|
)
|
||||||
|
b1 = np.zeros(
|
||||||
|
(
|
||||||
|
batch_size,
|
||||||
|
bottlenecks[1].shape[1],
|
||||||
|
bottlenecks[1].shape[2],
|
||||||
|
bottlenecks[1].shape[3],
|
||||||
|
)
|
||||||
|
)
|
||||||
|
b2 = np.zeros(
|
||||||
|
(
|
||||||
|
batch_size,
|
||||||
|
bottlenecks[2].shape[1],
|
||||||
|
bottlenecks[2].shape[2],
|
||||||
|
bottlenecks[2].shape[3],
|
||||||
|
)
|
||||||
|
)
|
||||||
|
for b in range(batch_size):
|
||||||
|
_, box = get_random_data(
|
||||||
|
annotation_lines[i], input_shape, random=False, proc_img=False
|
||||||
|
)
|
||||||
|
box_data.append(box)
|
||||||
|
b0[b] = bottlenecks[0][i]
|
||||||
|
b1[b] = bottlenecks[1][i]
|
||||||
|
b2[b] = bottlenecks[2][i]
|
||||||
|
i = (i + 1) % n
|
||||||
|
box_data = np.array(box_data)
|
||||||
|
y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
|
||||||
|
yield [b0, b1, b2, *y_true], np.zeros(batch_size)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
_main()
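The generators in this file walk the annotation list with a wrapping index (`i = (i + 1) % n`) and reshuffle whenever the index returns to 0. The following is a minimal standalone sketch of that pattern only; `cycle_batches` is a hypothetical helper, not part of this repository, and it uses Python's `random` module instead of `np.random` so it runs without NumPy.

```python
import random


def cycle_batches(items, batch_size, shuffle=True, seed=None):
    """Yield lists of `batch_size` items forever, wrapping around the end.

    Hypothetical helper: mirrors the `i = (i + 1) % n` indexing used by the
    generators above, reshuffling whenever the index comes back to 0.
    """
    if not items or batch_size <= 0:
        raise ValueError("need a non-empty list and a positive batch size")
    items = list(items)  # copy so the caller's list is not reordered
    rng = random.Random(seed)
    n = len(items)
    i = 0
    while True:
        batch = []
        for _ in range(batch_size):
            if i == 0 and shuffle:
                rng.shuffle(items)  # new order at the start of each pass
            batch.append(items[i])
            i = (i + 1) % n  # wrap around instead of stopping
        yield batch


gen = cycle_batches(["a", "b", "c"], batch_size=2, shuffle=False)
first, second = next(gen), next(gen)
```

With shuffling disabled, the second batch starts with the last item of the list and wraps back to the first, so a batch can span two passes over the data.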
|
||||||
@@ -0,0 +1,65 @@
|
|||||||
|
import xml.etree.ElementTree as ET
|
||||||
|
from os import getcwd
|
||||||
|
|
||||||
|
sets = [("2007", "train"), ("2007", "val"), ("2007", "test")]
|
||||||
|
|
||||||
|
classes = [
|
||||||
|
"aeroplane",
|
||||||
|
"bicycle",
|
||||||
|
"bird",
|
||||||
|
"boat",
|
||||||
|
"bottle",
|
||||||
|
"bus",
|
||||||
|
"car",
|
||||||
|
"cat",
|
||||||
|
"chair",
|
||||||
|
"cow",
|
||||||
|
"diningtable",
|
||||||
|
"dog",
|
||||||
|
"horse",
|
||||||
|
"motorbike",
|
||||||
|
"person",
|
||||||
|
"pottedplant",
|
||||||
|
"sheep",
|
||||||
|
"sofa",
|
||||||
|
"train",
|
||||||
|
"tvmonitor",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def convert_annotation(year, image_id, list_file):
|
||||||
|
in_file = open("VOCdevkit/VOC%s/Annotations/%s.xml" % (year, image_id))
|
||||||
|
tree = ET.parse(in_file)
|
||||||
|
root = tree.getroot()
|
||||||
|
|
||||||
|
for obj in root.iter("object"):
|
||||||
|
difficult = obj.find("difficult").text
|
||||||
|
cls = obj.find("name").text
|
||||||
|
if cls not in classes or int(difficult) == 1:
|
||||||
|
continue
|
||||||
|
cls_id = classes.index(cls)
|
||||||
|
xmlbox = obj.find("bndbox")
|
||||||
|
b = (
|
||||||
|
int(xmlbox.find("xmin").text),
|
||||||
|
int(xmlbox.find("ymin").text),
|
||||||
|
int(xmlbox.find("xmax").text),
|
||||||
|
int(xmlbox.find("ymax").text),
|
||||||
|
)
|
||||||
|
list_file.write(" " + ",".join([str(a) for a in b]) + "," + str(cls_id))
|
||||||
|
|
||||||
|
|
||||||
|
wd = getcwd()
|
||||||
|
|
||||||
|
for year, image_set in sets:
|
||||||
|
image_ids = (
|
||||||
|
open("VOCdevkit/VOC%s/ImageSets/Main/%s.txt" % (year, image_set))
|
||||||
|
.read()
|
||||||
|
.strip()
|
||||||
|
.split()
|
||||||
|
)
|
||||||
|
list_file = open("%s_%s.txt" % (year, image_set), "w")
|
||||||
|
for image_id in image_ids:
|
||||||
|
list_file.write("%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg" % (wd, year, image_id))
|
||||||
|
convert_annotation(year, image_id, list_file)
|
||||||
|
list_file.write("\n")
|
||||||
|
list_file.close()
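The script above writes one line per image: the image path followed by space-separated `xmin,ymin,xmax,ymax,class_id` groups. As a rough illustration of that conversion on in-memory data, the sketch below parses a made-up annotation string; `SAMPLE_XML` and `boxes_from_voc_xml` are assumptions introduced for this example and do not exist in the repository.

```python
import xml.etree.ElementTree as ET

# Made-up annotation with the same tags that convert_annotation reads.
SAMPLE_XML = """
<annotation>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def boxes_from_voc_xml(xml_text, class_names):
    """Return 'xmin,ymin,xmax,ymax,class_id' strings for the usable objects."""
    root = ET.fromstring(xml_text)
    out = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        difficult = int(obj.find("difficult").text)
        if name not in class_names or difficult == 1:
            continue  # same filter as convert_annotation above
        box = obj.find("bndbox")
        coords = [int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        out.append(",".join(str(v) for v in coords + [class_names.index(name)]))
    return out


line = "image.jpg " + " ".join(boxes_from_voc_xml(SAMPLE_XML, ["dog", "person"]))
```

The second object is skipped because it is marked difficult, so only the dog box reaches the output line.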
|
||||||
@@ -0,0 +1,285 @@
|
|||||||
|
# -*- coding: utf-8 -*-
|
||||||
|
"""
|
||||||
|
Class definition of YOLO_v3 style detection model on image and video
|
||||||
|
"""
|
||||||
|
|
||||||
|
import colorsys
|
||||||
|
import os
|
||||||
|
from timeit import default_timer as timer
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
|
from keras import backend as K
|
||||||
|
from keras.models import load_model
|
||||||
|
from keras.layers import Input
|
||||||
|
from PIL import Image, ImageFont, ImageDraw
|
||||||
|
|
||||||
|
from .yolo3.model import yolo_eval, yolo_body, tiny_yolo_body
|
||||||
|
from .yolo3.utils import letterbox_image
|
||||||
|
|
||||||
|
from keras.utils import multi_gpu_model
|
||||||
|
|
||||||
|
|
||||||
|
class YOLO(object):
|
||||||
|
_defaults = {
|
||||||
|
"model_path": "model_data/yolo.h5",
|
||||||
|
"anchors_path": "model_data/yolo_anchors.txt",
|
||||||
|
"classes_path": "model_data/coco_classes.txt",
|
||||||
|
"score": 0.3,
|
||||||
|
"iou": 0.45,
|
||||||
|
"model_image_size": (416, 416),
|
||||||
|
"gpu_num": 1,
|
||||||
|
}
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def get_defaults(cls, n):
|
||||||
|
if n in cls._defaults:
|
||||||
|
return cls._defaults[n]
|
||||||
|
else:
|
||||||
|
return "Unrecognized attribute name '" + n + "'"
|
||||||
|
|
||||||
|
def __init__(self, **kwargs):
|
||||||
|
self.__dict__.update(self._defaults) # set up default values
|
||||||
|
self.__dict__.update(kwargs) # and update with user overrides
|
||||||
|
self.class_names = self._get_class()
|
||||||
|
self.anchors = self._get_anchors()
|
||||||
|
self.sess = K.get_session()
|
||||||
|
self.boxes, self.scores, self.classes = self.generate()
|
||||||
|
|
||||||
|
def _get_class(self):
|
||||||
|
classes_path = os.path.expanduser(self.classes_path)
|
||||||
|
with open(classes_path) as f:
|
||||||
|
class_names = f.readlines()
|
||||||
|
class_names = [c.strip() for c in class_names]
|
||||||
|
return class_names
|
||||||
|
|
||||||
|
def _get_anchors(self):
|
||||||
|
anchors_path = os.path.expanduser(self.anchors_path)
|
||||||
|
with open(anchors_path) as f:
|
||||||
|
anchors = f.readline()
|
||||||
|
anchors = [float(x) for x in anchors.split(",")]
|
||||||
|
return np.array(anchors).reshape(-1, 2)
|
||||||
|
|
||||||
|
def generate(self):
|
||||||
|
model_path = os.path.expanduser(self.model_path)
|
||||||
|
assert model_path.endswith(".h5"), "Keras model or weights must be a .h5 file."
|
||||||
|
|
||||||
|
# Load model, or construct model and load weights.
|
||||||
|
start = timer()
|
||||||
|
num_anchors = len(self.anchors)
|
||||||
|
num_classes = len(self.class_names)
|
||||||
|
is_tiny_version = num_anchors == 6 # default setting
|
||||||
|
try:
|
||||||
|
self.yolo_model = load_model(model_path, compile=False)
|
||||||
|
except Exception:  # fall back to building the model and loading weights only
|
||||||
|
self.yolo_model = (
|
||||||
|
tiny_yolo_body(
|
||||||
|
Input(shape=(None, None, 3)), num_anchors // 2, num_classes
|
||||||
|
)
|
||||||
|
if is_tiny_version
|
||||||
|
else yolo_body(
|
||||||
|
Input(shape=(None, None, 3)), num_anchors // 3, num_classes
|
||||||
|
)
|
||||||
|
)
|
||||||
|
self.yolo_model.load_weights(
|
||||||
|
self.model_path
|
||||||
|
) # make sure model, anchors and classes match
|
||||||
|
else:
|
||||||
|
assert self.yolo_model.layers[-1].output_shape[-1] == num_anchors / len(
|
||||||
|
self.yolo_model.output
|
||||||
|
) * (
|
||||||
|
num_classes + 5
|
||||||
|
), "Mismatch between model and given anchor and class sizes"
|
||||||
|
|
||||||
|
end = timer()
|
||||||
|
print(
|
||||||
|
"{} model, anchors, and classes loaded in {:.2f}sec.".format(
|
||||||
|
model_path, end - start
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
# Generate colors for drawing bounding boxes.
|
||||||
|
if len(self.class_names) == 1:
|
||||||
|
self.colors = ["GreenYellow"]
|
||||||
|
else:
|
||||||
|
hsv_tuples = [
|
||||||
|
(x / len(self.class_names), 1.0, 1.0)
|
||||||
|
for x in range(len(self.class_names))
|
||||||
|
]
|
||||||
|
self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
|
||||||
|
self.colors = list(
|
||||||
|
map(
|
||||||
|
lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
|
||||||
|
self.colors,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
np.random.seed(10101) # Fixed seed for consistent colors across runs.
|
||||||
|
np.random.shuffle(
|
||||||
|
self.colors
|
||||||
|
) # Shuffle colors to decorrelate adjacent classes.
|
||||||
|
np.random.seed(None) # Reset seed to default.
|
||||||
|
|
||||||
|
# Generate output tensor targets for filtered bounding boxes.
|
||||||
|
self.input_image_shape = K.placeholder(shape=(2,))
|
||||||
|
if self.gpu_num >= 2:
|
||||||
|
self.yolo_model = multi_gpu_model(self.yolo_model, gpus=self.gpu_num)
|
||||||
|
boxes, scores, classes = yolo_eval(
|
||||||
|
self.yolo_model.output,
|
||||||
|
self.anchors,
|
||||||
|
len(self.class_names),
|
||||||
|
self.input_image_shape,
|
||||||
|
score_threshold=self.score,
|
||||||
|
iou_threshold=self.iou,
|
||||||
|
)
|
||||||
|
return boxes, scores, classes
|
||||||
|
|
||||||
|
def detect_image(self, image, show_stats=True):
|
||||||
|
start = timer()
|
||||||
|
|
||||||
|
if self.model_image_size != (None, None):
|
||||||
|
assert self.model_image_size[0] % 32 == 0, "Multiples of 32 required"
|
||||||
|
assert self.model_image_size[1] % 32 == 0, "Multiples of 32 required"
|
||||||
|
boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
|
||||||
|
else:
|
||||||
|
new_image_size = (
|
||||||
|
image.width - (image.width % 32),
|
||||||
|
image.height - (image.height % 32),
|
||||||
|
)
|
||||||
|
boxed_image = letterbox_image(image, new_image_size)
|
||||||
|
image_data = np.array(boxed_image, dtype="float32")
|
||||||
|
if show_stats:
|
||||||
|
print(image_data.shape)
|
||||||
|
image_data /= 255.0
|
||||||
|
image_data = np.expand_dims(image_data, 0) # Add batch dimension.
|
||||||
|
|
||||||
|
out_boxes, out_scores, out_classes = self.sess.run(
|
||||||
|
[self.boxes, self.scores, self.classes],
|
||||||
|
feed_dict={
|
||||||
|
self.yolo_model.input: image_data,
|
||||||
|
self.input_image_shape: [image.size[1], image.size[0]],
|
||||||
|
K.learning_phase(): 0,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
if show_stats:
|
||||||
|
print("Found {} boxes for {}".format(len(out_boxes), "img"))
|
||||||
|
out_prediction = []
|
||||||
|
|
||||||
|
font_path = os.path.join(os.path.dirname(__file__), "font/FiraMono-Medium.otf")
|
||||||
|
font = ImageFont.truetype(
|
||||||
|
font=font_path, size=np.floor(3e-2 * image.size[1] + 0.5).astype("int32")
|
||||||
|
)
|
||||||
|
thickness = (image.size[0] + image.size[1]) // 300
|
||||||
|
|
||||||
|
for i, c in reversed(list(enumerate(out_classes))):
|
||||||
|
predicted_class = self.class_names[c]
|
||||||
|
box = out_boxes[i]
|
||||||
|
score = out_scores[i]
|
||||||
|
|
||||||
|
label = "{} {:.2f}".format(predicted_class, score)
|
||||||
|
draw = ImageDraw.Draw(image)
|
||||||
|
label_size = draw.textsize(label, font)
|
||||||
|
|
||||||
|
top, left, bottom, right = box
|
||||||
|
top = max(0, np.floor(top + 0.5).astype("int32"))
|
||||||
|
left = max(0, np.floor(left + 0.5).astype("int32"))
|
||||||
|
bottom = min(image.size[1], np.floor(bottom + 0.5).astype("int32"))
|
||||||
|
right = min(image.size[0], np.floor(right + 0.5).astype("int32"))
|
||||||
|
|
||||||
|
# image was expanded to model_image_size: make sure it did not pick
|
||||||
|
# up any box outside of original image (run into this bug when
|
||||||
|
# lowering confidence threshold to 0.01)
|
||||||
|
if top > image.size[1] or right > image.size[0]:
|
||||||
|
continue
|
||||||
|
if show_stats:
|
||||||
|
print(label, (left, top), (right, bottom))
|
||||||
|
|
||||||
|
# output as xmin, ymin, xmax, ymax, class_index, confidence
|
||||||
|
out_prediction.append([left, top, right, bottom, c, score])
|
||||||
|
|
||||||
|
if top - label_size[1] >= 0:
|
||||||
|
text_origin = np.array([left, top - label_size[1]])
|
||||||
|
else:
|
||||||
|
text_origin = np.array([left, bottom])
|
||||||
|
|
||||||
|
# My kingdom for a good redistributable image drawing library.
|
||||||
|
for i in range(thickness):
|
||||||
|
draw.rectangle(
|
||||||
|
[left + i, top + i, right - i, bottom - i], outline=self.colors[c]
|
||||||
|
)
|
||||||
|
draw.rectangle(
|
||||||
|
[tuple(text_origin), tuple(text_origin + label_size)],
|
||||||
|
fill=self.colors[c],
|
||||||
|
)
|
||||||
|
|
||||||
|
draw.text(text_origin, label, fill=(0, 0, 0), font=font)
|
||||||
|
del draw
|
||||||
|
|
||||||
|
end = timer()
|
||||||
|
if show_stats:
|
||||||
|
print("Time spent: {:.3f}sec".format(end - start))
|
||||||
|
return out_prediction, image
|
||||||
|
|
||||||
|
def close_session(self):
|
||||||
|
self.sess.close()
|
||||||
|
|
||||||
|
|
||||||
|
def detect_video(yolo, video_path, output_path=""):
|
||||||
|
import cv2
|
||||||
|
|
||||||
|
vid = cv2.VideoCapture(video_path)
|
||||||
|
if not vid.isOpened():
|
||||||
|
raise IOError("Couldn't open webcam or video")
|
||||||
|
video_FourCC = cv2.VideoWriter_fourcc(*"mp4v") # int(vid.get(cv2.CAP_PROP_FOURCC))
|
||||||
|
video_fps = vid.get(cv2.CAP_PROP_FPS)
|
||||||
|
video_size = (
|
||||||
|
int(vid.get(cv2.CAP_PROP_FRAME_WIDTH)),
|
||||||
|
int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)),
|
||||||
|
)
|
||||||
|
isOutput = output_path != ""
|
||||||
|
if isOutput:
|
||||||
|
print(
|
||||||
|
"Processing {} with frame size {} at {:.1f} FPS".format(
|
||||||
|
os.path.basename(video_path), video_size, video_fps
|
||||||
|
)
|
||||||
|
)
|
||||||
|
# print("!!! TYPE:", type(output_path), type(video_FourCC), type(video_fps), type(video_size))
|
||||||
|
out = cv2.VideoWriter(output_path, video_FourCC, video_fps, video_size)
|
||||||
|
accum_time = 0
|
||||||
|
curr_fps = 0
|
||||||
|
fps = "FPS: ??"
|
||||||
|
prev_time = timer()
|
||||||
|
while vid.isOpened():
|
||||||
|
return_value, frame = vid.read()
|
||||||
|
if not return_value:
|
||||||
|
break
|
||||||
|
# opencv images are BGR, translate to RGB
|
||||||
|
frame = frame[:, :, ::-1]
|
||||||
|
image = Image.fromarray(frame)
|
||||||
|
out_pred, image = yolo.detect_image(image, show_stats=False)
|
||||||
|
result = np.asarray(image)
|
||||||
|
curr_time = timer()
|
||||||
|
exec_time = curr_time - prev_time
|
||||||
|
prev_time = curr_time
|
||||||
|
accum_time = accum_time + exec_time
|
||||||
|
curr_fps = curr_fps + 1
|
||||||
|
if accum_time > 1:
|
||||||
|
accum_time = accum_time - 1
|
||||||
|
fps = "FPS: " + str(curr_fps)
|
||||||
|
curr_fps = 0
|
||||||
|
cv2.putText(
|
||||||
|
result,
|
||||||
|
text=fps,
|
||||||
|
org=(3, 15),
|
||||||
|
fontFace=cv2.FONT_HERSHEY_SIMPLEX,
|
||||||
|
fontScale=0.50,
|
||||||
|
color=(255, 0, 0),
|
||||||
|
thickness=2,
|
||||||
|
)
|
||||||
|
# cv2.namedWindow("result", cv2.WINDOW_NORMAL)
|
||||||
|
# cv2.imshow("result", result)
|
||||||
|
if isOutput:
|
||||||
|
out.write(result[:, :, ::-1])
|
||||||
|
# if cv2.waitKey(1) & 0xFF == ord('q'):
|
||||||
|
# break
|
||||||
|
vid.release()
|
||||||
|
if isOutput:  # `out` only exists when an output path was given
    out.release()
|
||||||
|
# yolo.close_session()
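The `generate` method above builds one colour per class by spacing hues evenly around the HSV wheel and then shuffling them with a fixed seed. A minimal sketch of that idea follows; `class_colors` is a hypothetical name, and it shuffles with a local `random.Random` instance rather than reseeding NumPy's global generator, so the exact colour order differs from the class above.

```python
import colorsys
import random


def class_colors(num_classes, seed=10101):
    """Return one (R, G, B) tuple per class, evenly spaced in hue."""
    colors = []
    for i in range(num_classes):
        r, g, b = colorsys.hsv_to_rgb(i / num_classes, 1.0, 1.0)
        colors.append((int(r * 255), int(g * 255), int(b * 255)))
    # Shuffle so neighbouring class ids do not get neighbouring hues.
    # A local generator keeps the global random state untouched.
    random.Random(seed).shuffle(colors)
    return colors


palette = class_colors(4)
```

Using a private generator avoids the seed-then-reset dance on the global state, at the cost of not reproducing the original palette order.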
|
||||||
@@ -0,0 +1,521 @@
|
|||||||
|
"""YOLO_v3 Model Defined in Keras."""
|
||||||
|
|
||||||
|
from functools import wraps
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
|
import tensorflow as tf
|
||||||
|
from keras import backend as K
|
||||||
|
from keras.layers import (
|
||||||
|
Conv2D,
|
||||||
|
Add,
|
||||||
|
ZeroPadding2D,
|
||||||
|
UpSampling2D,
|
||||||
|
Concatenate,
|
||||||
|
MaxPooling2D,
|
||||||
|
)
|
||||||
|
from keras.layers.advanced_activations import LeakyReLU
|
||||||
|
from keras.layers.normalization import BatchNormalization
|
||||||
|
from keras.models import Model
|
||||||
|
from keras.regularizers import l2
|
||||||
|
|
||||||
|
from ..yolo3.utils import compose
|
||||||
|
|
||||||
|
|
||||||
|
@wraps(Conv2D)
|
||||||
|
def DarknetConv2D(*args, **kwargs):
|
||||||
|
"""Wrapper to set Darknet parameters for Convolution2D."""
|
||||||
|
darknet_conv_kwargs = {"kernel_regularizer": l2(5e-4)}
|
||||||
|
darknet_conv_kwargs["padding"] = (
|
||||||
|
"valid" if kwargs.get("strides") == (2, 2) else "same"
|
||||||
|
)
|
||||||
|
darknet_conv_kwargs.update(kwargs)
|
||||||
|
return Conv2D(*args, **darknet_conv_kwargs)
|
||||||
|
|
||||||
|
|
||||||
|
def DarknetConv2D_BN_Leaky(*args, **kwargs):
|
||||||
|
"""Darknet Convolution2D followed by BatchNormalization and LeakyReLU."""
|
||||||
|
no_bias_kwargs = {"use_bias": False}
|
||||||
|
no_bias_kwargs.update(kwargs)
|
||||||
|
return compose(
|
||||||
|
DarknetConv2D(*args, **no_bias_kwargs),
|
||||||
|
BatchNormalization(),
|
||||||
|
LeakyReLU(alpha=0.1),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def resblock_body(x, num_filters, num_blocks):
|
||||||
|
"""A series of resblocks starting with a downsampling Convolution2D"""
|
||||||
|
# Darknet uses left and top padding instead of 'same' mode
|
||||||
|
x = ZeroPadding2D(((1, 0), (1, 0)))(x)
|
||||||
|
x = DarknetConv2D_BN_Leaky(num_filters, (3, 3), strides=(2, 2))(x)
|
||||||
|
for i in range(num_blocks):
|
||||||
|
y = compose(
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters // 2, (1, 1)),
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters, (3, 3)),
|
||||||
|
)(x)
|
||||||
|
x = Add()([x, y])
|
||||||
|
return x
|
||||||
|
|
||||||
|
|
||||||
|
def darknet_body(x):
|
||||||
|
"""Darknent body having 52 Convolution2D layers"""
|
||||||
|
x = DarknetConv2D_BN_Leaky(32, (3, 3))(x)
|
||||||
|
x = resblock_body(x, 64, 1)
|
||||||
|
x = resblock_body(x, 128, 2)
|
||||||
|
x = resblock_body(x, 256, 8)
|
||||||
|
x = resblock_body(x, 512, 8)
|
||||||
|
x = resblock_body(x, 1024, 4)
|
||||||
|
return x
|
||||||
|
|
||||||
|
|
||||||
|
def make_last_layers(x, num_filters, out_filters):
|
||||||
|
"""6 Conv2D_BN_Leaky layers followed by a Conv2D_linear layer"""
|
||||||
|
x = compose(
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
|
||||||
|
)(x)
|
||||||
|
y = compose(
|
||||||
|
DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
|
||||||
|
DarknetConv2D(out_filters, (1, 1)),
|
||||||
|
)(x)
|
||||||
|
return x, y
|
||||||
|
|
||||||
|
|
||||||
|
def yolo_body(inputs, num_anchors, num_classes):
|
||||||
|
"""Create YOLO_V3 model CNN body in Keras."""
|
||||||
|
darknet = Model(inputs, darknet_body(inputs))
|
||||||
|
x, y1 = make_last_layers(darknet.output, 512, num_anchors * (num_classes + 5))
|
||||||
|
|
||||||
|
x = compose(DarknetConv2D_BN_Leaky(256, (1, 1)), UpSampling2D(2))(x)
|
||||||
|
x = Concatenate()([x, darknet.layers[152].output])
|
||||||
|
x, y2 = make_last_layers(x, 256, num_anchors * (num_classes + 5))
|
||||||
|
|
||||||
|
x = compose(DarknetConv2D_BN_Leaky(128, (1, 1)), UpSampling2D(2))(x)
|
||||||
|
x = Concatenate()([x, darknet.layers[92].output])
|
||||||
|
x, y3 = make_last_layers(x, 128, num_anchors * (num_classes + 5))
|
||||||
|
|
||||||
|
return Model(inputs, [y1, y2, y3])
|
||||||
|
|
||||||
|
|
||||||
|
def tiny_yolo_body(inputs, num_anchors, num_classes):
|
||||||
|
"""Create Tiny YOLO_v3 model CNN body in keras."""
|
||||||
|
x1 = compose(
|
||||||
|
DarknetConv2D_BN_Leaky(16, (3, 3)),
|
||||||
|
MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding="same"),
|
||||||
|
DarknetConv2D_BN_Leaky(32, (3, 3)),
|
||||||
|
MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding="same"),
|
||||||
|
DarknetConv2D_BN_Leaky(64, (3, 3)),
|
||||||
|
MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding="same"),
|
||||||
|
DarknetConv2D_BN_Leaky(128, (3, 3)),
|
||||||
|
MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding="same"),
|
||||||
|
DarknetConv2D_BN_Leaky(256, (3, 3)),
|
||||||
|
)(inputs)
|
||||||
|
x2 = compose(
|
||||||
|
MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding="same"),
|
||||||
|
DarknetConv2D_BN_Leaky(512, (3, 3)),
|
||||||
|
MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding="same"),
|
||||||
|
DarknetConv2D_BN_Leaky(1024, (3, 3)),
|
||||||
|
DarknetConv2D_BN_Leaky(256, (1, 1)),
|
||||||
|
)(x1)
|
||||||
|
y1 = compose(
|
||||||
|
DarknetConv2D_BN_Leaky(512, (3, 3)),
|
||||||
|
DarknetConv2D(num_anchors * (num_classes + 5), (1, 1)),
|
||||||
|
)(x2)
|
||||||
|
|
||||||
|
x2 = compose(DarknetConv2D_BN_Leaky(128, (1, 1)), UpSampling2D(2))(x2)
|
||||||
|
y2 = compose(
|
||||||
|
Concatenate(),
|
||||||
|
DarknetConv2D_BN_Leaky(256, (3, 3)),
|
||||||
|
DarknetConv2D(num_anchors * (num_classes + 5), (1, 1)),
|
||||||
|
)([x2, x1])
|
||||||
|
|
||||||
|
return Model(inputs, [y1, y2])
|
||||||
|
|
||||||
|
|
||||||
|
def yolo_head(feats, anchors, num_classes, input_shape, calc_loss=False):
|
||||||
|
"""Convert final layer features to bounding box parameters."""
|
||||||
|
num_anchors = len(anchors)
|
||||||
|
# Reshape to batch, height, width, num_anchors, box_params.
|
||||||
|
anchors_tensor = K.reshape(K.constant(anchors), [1, 1, 1, num_anchors, 2])
|
||||||
|
|
||||||
|
grid_shape = K.shape(feats)[1:3] # height, width
|
||||||
|
grid_y = K.tile(
|
||||||
|
K.reshape(K.arange(0, stop=grid_shape[0]), [-1, 1, 1, 1]),
|
||||||
|
[1, grid_shape[1], 1, 1],
|
||||||
|
)
|
||||||
|
grid_x = K.tile(
|
||||||
|
K.reshape(K.arange(0, stop=grid_shape[1]), [1, -1, 1, 1]),
|
||||||
|
[grid_shape[0], 1, 1, 1],
|
||||||
|
)
|
||||||
|
grid = K.concatenate([grid_x, grid_y])
|
||||||
|
grid = K.cast(grid, K.dtype(feats))
|
||||||
|
|
||||||
|
feats = K.reshape(
|
||||||
|
feats, [-1, grid_shape[0], grid_shape[1], num_anchors, num_classes + 5]
|
||||||
|
)
|
||||||
|
|
||||||
|
# Adjust predictions to each spatial grid point and anchor size.
|
||||||
|
box_xy = (K.sigmoid(feats[..., :2]) + grid) / K.cast(
|
||||||
|
grid_shape[::-1], K.dtype(feats)
|
||||||
|
)
|
||||||
|
box_wh = (
|
||||||
|
K.exp(feats[..., 2:4])
|
||||||
|
* anchors_tensor
|
||||||
|
/ K.cast(input_shape[::-1], K.dtype(feats))
|
||||||
|
)
|
||||||
|
box_confidence = K.sigmoid(feats[..., 4:5])
|
||||||
|
box_class_probs = K.sigmoid(feats[..., 5:])
|
||||||
|
|
||||||
|
if calc_loss:
|
||||||
|
return grid, feats, box_xy, box_wh
|
||||||
|
return box_xy, box_wh, box_confidence, box_class_probs
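The arithmetic in `yolo_head` can be checked by hand for a single grid cell. The sketch below restates the two formulas in plain Python for one prediction; `decode_cell` and its argument names are assumptions made for this example, and it ignores the batch and anchor dimensions that the Keras version handles.

```python
import math


def decode_cell(raw, cell_xy, grid_wh, anchor_wh, input_wh):
    """Decode one raw prediction (tx, ty, tw, th) into a normalised box.

    Centre: sigmoid of the offset plus the cell index, divided by grid size.
    Size: exp of the raw value times the anchor, divided by the input size.
    """
    tx, ty, tw, th = raw

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    x = (sigmoid(tx) + cell_xy[0]) / grid_wh[0]
    y = (sigmoid(ty) + cell_xy[1]) / grid_wh[1]
    w = math.exp(tw) * anchor_wh[0] / input_wh[0]
    h = math.exp(th) * anchor_wh[1] / input_wh[1]
    return (x, y), (w, h)


xy, wh = decode_cell(
    raw=(0.0, 0.0, 0.0, 0.0),
    cell_xy=(0, 0),
    grid_wh=(2, 2),
    anchor_wh=(32, 32),
    input_wh=(64, 64),
)
```

An all-zero prediction lands in the middle of its cell and takes exactly the anchor's size, which is why the anchors act as the default box shapes.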
|
||||||
|
|
||||||
|
|
||||||
|
def yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape):
|
||||||
|
"""Get corrected boxes"""
|
||||||
|
box_yx = box_xy[..., ::-1]
|
||||||
|
box_hw = box_wh[..., ::-1]
|
||||||
|
input_shape = K.cast(input_shape, K.dtype(box_yx))
|
||||||
|
image_shape = K.cast(image_shape, K.dtype(box_yx))
|
||||||
|
new_shape = K.round(image_shape * K.min(input_shape / image_shape))
|
||||||
|
offset = (input_shape - new_shape) / 2.0 / input_shape
|
||||||
|
scale = input_shape / new_shape
|
||||||
|
box_yx = (box_yx - offset) * scale
|
||||||
|
box_hw *= scale
|
||||||
|
|
||||||
|
box_mins = box_yx - (box_hw / 2.0)
|
||||||
|
box_maxes = box_yx + (box_hw / 2.0)
|
||||||
|
boxes = K.concatenate(
|
||||||
|
[
|
||||||
|
box_mins[..., 0:1], # y_min
|
||||||
|
box_mins[..., 1:2], # x_min
|
||||||
|
box_maxes[..., 0:1], # y_max
|
||||||
|
box_maxes[..., 1:2], # x_max
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
# Scale boxes back to original image shape.
|
||||||
|
boxes *= K.concatenate([image_shape, image_shape])
|
||||||
|
return boxes
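`yolo_correct_boxes` undoes the letterbox padding: it removes the padding offset, rescales to the unpadded area, and multiplies by the original image size. Below is a plain-Python sketch of the same steps for a single box, under the assumption that all shapes are given as (height, width); `correct_box` is a hypothetical name and not part of the module.

```python
def correct_box(box_yx, box_hw, input_hw, image_hw):
    """Map a normalised letterboxed box back to original-image pixels.

    Returns [y_min, x_min, y_max, x_max], following the order used above.
    """
    # Scale that letterboxing applied to fit the image inside the input.
    fit = min(input_hw[0] / image_hw[0], input_hw[1] / image_hw[1])
    new_hw = [round(image_hw[k] * fit) for k in range(2)]
    mins, maxes = [], []
    for k in range(2):  # k = 0 is the y axis, k = 1 is the x axis
        offset = (input_hw[k] - new_hw[k]) / 2.0 / input_hw[k]
        scale = input_hw[k] / new_hw[k]
        centre = (box_yx[k] - offset) * scale
        size = box_hw[k] * scale
        mins.append((centre - size / 2.0) * image_hw[k])
        maxes.append((centre + size / 2.0) * image_hw[k])
    return mins + maxes


# A 208x416 image letterboxed into 416x416 is padded above and below.
box = correct_box(
    box_yx=(0.5, 0.5), box_hw=(0.5, 1.0), input_hw=(416, 416), image_hw=(208, 416)
)
```

In this example the box covers exactly the unpadded band of the network input, so it maps back to the full original image.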
|
||||||
|
|
||||||
|
|
||||||
|
def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape):
|
||||||
|
"""Process Conv layer output"""
|
||||||
|
box_xy, box_wh, box_confidence, box_class_probs = yolo_head(
|
||||||
|
feats, anchors, num_classes, input_shape
|
||||||
|
)
|
||||||
|
boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape)
|
||||||
|
boxes = K.reshape(boxes, [-1, 4])
|
||||||
|
box_scores = box_confidence * box_class_probs
|
||||||
|
box_scores = K.reshape(box_scores, [-1, num_classes])
|
||||||
|
return boxes, box_scores
|
||||||
|
|
||||||
|
|
||||||
|
def yolo_eval(
|
||||||
|
yolo_outputs,
|
||||||
|
anchors,
|
||||||
|
num_classes,
|
||||||
|
image_shape,
|
||||||
|
max_boxes=20,
|
||||||
|
score_threshold=0.6,
|
||||||
|
iou_threshold=0.5,
|
||||||
|
):
|
||||||
|
"""Evaluate YOLO model on given input and return filtered boxes."""
|
||||||
|
num_layers = len(yolo_outputs)
|
||||||
|
anchor_mask = (
|
||||||
|
[[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]
|
||||||
|
) # default setting
|
||||||
|
input_shape = K.shape(yolo_outputs[0])[1:3] * 32
|
||||||
|
boxes = []
|
||||||
|
box_scores = []
|
||||||
|
for l in range(num_layers):
|
||||||
|
_boxes, _box_scores = yolo_boxes_and_scores(
|
||||||
|
yolo_outputs[l],
|
||||||
|
anchors[anchor_mask[l]],
|
||||||
|
num_classes,
|
||||||
|
input_shape,
|
||||||
|
image_shape,
|
||||||
|
)
|
||||||
|
boxes.append(_boxes)
|
||||||
|
box_scores.append(_box_scores)
|
||||||
|
boxes = K.concatenate(boxes, axis=0)
|
||||||
|
box_scores = K.concatenate(box_scores, axis=0)
|
||||||
|
|
||||||
|
mask = box_scores >= score_threshold
|
||||||
|
max_boxes_tensor = K.constant(max_boxes, dtype="int32")
|
||||||
|
boxes_ = []
|
||||||
|
scores_ = []
|
||||||
|
classes_ = []
|
||||||
|
for c in range(num_classes):
|
||||||
|
# TODO: use keras backend instead of tf.
|
||||||
|
class_boxes = tf.boolean_mask(boxes, mask[:, c])
|
||||||
|
class_box_scores = tf.boolean_mask(box_scores[:, c], mask[:, c])
|
||||||
|
nms_index = tf.image.non_max_suppression(
|
||||||
|
class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold
|
||||||
|
)
|
||||||
|
class_boxes = K.gather(class_boxes, nms_index)
|
||||||
|
class_box_scores = K.gather(class_box_scores, nms_index)
|
||||||
|
classes = K.ones_like(class_box_scores, "int32") * c
|
||||||
|
boxes_.append(class_boxes)
|
||||||
|
scores_.append(class_box_scores)
|
||||||
|
classes_.append(classes)
|
||||||
|
boxes_ = K.concatenate(boxes_, axis=0)
|
||||||
|
scores_ = K.concatenate(scores_, axis=0)
|
||||||
|
classes_ = K.concatenate(classes_, axis=0)
|
||||||
|
|
||||||
|
return boxes_, scores_, classes_
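`yolo_eval` delegates the per-class suppression step to `tf.image.non_max_suppression`. To show what that step does conceptually, here is a greedy non-maximum suppression in plain Python. It is a stand-in written for illustration, not TensorFlow's implementation; `iou_corners` and `greedy_nms` are hypothetical names.

```python
def iou_corners(a, b):
    """IoU of two (y_min, x_min, y_max, x_max) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_boxes=20, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_boxes:
            break
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou_corners(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

The second box overlaps the first with an IoU of about 0.68, so it is dropped, while the distant third box survives.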
|
||||||
|
|
||||||
|
|
||||||
|
def preprocess_true_boxes(true_boxes, input_shape, anchors, num_classes):
|
||||||
|
"""Preprocess true boxes to training input format
|
||||||
|
|
||||||
|
Parameters
|
||||||
|
----------
|
||||||
|
true_boxes: array, shape=(m, T, 5)
|
||||||
|
Absolute x_min, y_min, x_max, y_max, class_id relative to input_shape.
|
||||||
|
input_shape: array-like, hw, multiples of 32
|
||||||
|
anchors: array, shape=(N, 2), wh
|
||||||
|
num_classes: integer
|
||||||
|
|
||||||
|
Returns
|
||||||
|
-------
|
||||||
|
y_true: list of array, shape like yolo_outputs, xywh are relative values
|
||||||
|
|
||||||
|
"""
|
||||||
|
assert (
|
||||||
|
true_boxes[..., 4] < num_classes
|
||||||
|
).all(), "class id must be less than num_classes"
|
||||||
|
num_layers = len(anchors) // 3 # default setting
|
||||||
|
anchor_mask = (
|
||||||
|
[[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]
|
||||||
|
)
|
||||||
|
|
||||||
|
true_boxes = np.array(true_boxes, dtype="float32")
|
||||||
|
input_shape = np.array(input_shape, dtype="int32")
|
||||||
|
boxes_xy = (true_boxes[..., 0:2] + true_boxes[..., 2:4]) // 2
|
||||||
|
boxes_wh = true_boxes[..., 2:4] - true_boxes[..., 0:2]
|
||||||
|
true_boxes[..., 0:2] = boxes_xy / input_shape[::-1]
|
||||||
|
true_boxes[..., 2:4] = boxes_wh / input_shape[::-1]
|
||||||
|
|
||||||
|
m = true_boxes.shape[0]
|
||||||
|
grid_shapes = [input_shape // {0: 32, 1: 16, 2: 8}[l] for l in range(num_layers)]
|
||||||
|
y_true = [
|
||||||
|
np.zeros(
|
||||||
|
(
|
||||||
|
m,
|
||||||
|
grid_shapes[l][0],
|
||||||
|
grid_shapes[l][1],
|
||||||
|
len(anchor_mask[l]),
|
||||||
|
5 + num_classes,
|
||||||
|
),
|
||||||
|
dtype="float32",
|
||||||
|
)
|
||||||
|
for l in range(num_layers)
|
||||||
|
]
|
||||||
|
|
||||||
|
# Expand dim to apply broadcasting.
|
||||||
|
anchors = np.expand_dims(anchors, 0)
|
||||||
|
anchor_maxes = anchors / 2.0
|
||||||
|
anchor_mins = -anchor_maxes
|
||||||
|
valid_mask = boxes_wh[..., 0] > 0
|
||||||
|
|
||||||
|
for b in range(m):
|
||||||
|
# Discard zero rows.
|
||||||
|
wh = boxes_wh[b, valid_mask[b]]
|
||||||
|
if len(wh) == 0:
|
||||||
|
continue
|
||||||
|
# Expand dim to apply broadcasting.
|
||||||
|
wh = np.expand_dims(wh, -2)
|
||||||
|
box_maxes = wh / 2.0
|
||||||
|
box_mins = -box_maxes
|
||||||
|
|
||||||
|
intersect_mins = np.maximum(box_mins, anchor_mins)
|
||||||
|
intersect_maxes = np.minimum(box_maxes, anchor_maxes)
|
||||||
|
intersect_wh = np.maximum(intersect_maxes - intersect_mins, 0.0)
|
||||||
|
intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
|
||||||
|
box_area = wh[..., 0] * wh[..., 1]
|
||||||
|
anchor_area = anchors[..., 0] * anchors[..., 1]
|
||||||
|
iou = intersect_area / (box_area + anchor_area - intersect_area)
|
||||||
|
|
||||||
|
# Find best anchor for each true box
|
||||||
|
best_anchor = np.argmax(iou, axis=-1)
|
||||||
|
|
||||||
|
for t, n in enumerate(best_anchor):
|
||||||
|
for l in range(num_layers):
|
||||||
|
if n in anchor_mask[l]:
|
||||||
|
i = np.floor(true_boxes[b, t, 0] * grid_shapes[l][1]).astype(
|
||||||
|
"int32"
|
||||||
|
)
|
||||||
|
j = np.floor(true_boxes[b, t, 1] * grid_shapes[l][0]).astype(
|
||||||
|
"int32"
|
||||||
|
)
|
||||||
|
k = anchor_mask[l].index(n)
|
||||||
|
c = true_boxes[b, t, 4].astype("int32")
|
||||||
|
y_true[l][b, j, i, k, 0:4] = true_boxes[b, t, 0:4]
|
||||||
|
y_true[l][b, j, i, k, 4] = 1
|
||||||
|
y_true[l][b, j, i, k, 5 + c] = 1
|
||||||
|
|
||||||
|
return y_true
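`preprocess_true_boxes` assigns each ground-truth box to the anchor with the highest IoU, comparing widths and heights only, as if both boxes were centred at the origin. The following plain-Python sketch shows that matching for one box. `ANCHORS` holds illustrative (width, height) pairs in pixels chosen for this example, `ANCHOR_MASK` copies the three-layer mask from the function above, and `best_anchor` is a hypothetical helper.

```python
# Illustrative anchor sizes (width, height) in pixels, smallest first.
ANCHORS = [
    (10, 13), (16, 30), (33, 23),
    (30, 61), (62, 45), (59, 119),
    (116, 90), (156, 198), (373, 326),
]
# Same grouping as anchor_mask above for the three-output model.
ANCHOR_MASK = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]


def best_anchor(box_wh, anchors):
    """Return (index, iou) of the anchor that best matches a box size."""
    best_index, best_iou = -1, -1.0
    for index, (aw, ah) in enumerate(anchors):
        # With both boxes centred at the origin, the overlap is the
        # smaller width times the smaller height.
        inter = min(box_wh[0], aw) * min(box_wh[1], ah)
        union = box_wh[0] * box_wh[1] + aw * ah - inter
        iou = inter / union
        if iou > best_iou:
            best_index, best_iou = index, iou
    return best_index, best_iou


index, iou = best_anchor((60, 120), ANCHORS)
# Output layer whose mask contains the chosen anchor.
layer = next(l for l, mask in enumerate(ANCHOR_MASK) if index in mask)
```

Because the comparison ignores position, the anchor choice depends only on the box's shape; the grid cell is then picked separately from the box centre.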
|
||||||
|
|
||||||
|
|
||||||
|
def box_iou(b1, b2):
|
||||||
|
"""Return iou tensor
|
||||||
|
|
||||||
|
Parameters
|
||||||
|
----------
|
||||||
|
b1: tensor, shape=(i1,...,iN, 4), xywh
|
||||||
|
b2: tensor, shape=(j, 4), xywh
|
||||||
|
|
||||||
|
Returns
|
||||||
|
-------
|
||||||
|
iou: tensor, shape=(i1,...,iN, j)
|
||||||
|
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Expand dim to apply broadcasting.
|
||||||
|
b1 = K.expand_dims(b1, -2)
|
||||||
|
b1_xy = b1[..., :2]
|
||||||
|
b1_wh = b1[..., 2:4]
|
||||||
|
b1_wh_half = b1_wh / 2.0
|
||||||
|
b1_mins = b1_xy - b1_wh_half
|
||||||
|
b1_maxes = b1_xy + b1_wh_half
|
||||||
|
|
||||||
|
# Expand dim to apply broadcasting.
|
||||||
|
b2 = K.expand_dims(b2, 0)
|
||||||
|
b2_xy = b2[..., :2]
|
||||||
|
b2_wh = b2[..., 2:4]
|
||||||
|
b2_wh_half = b2_wh / 2.0
|
||||||
|
b2_mins = b2_xy - b2_wh_half
|
||||||
|
b2_maxes = b2_xy + b2_wh_half
|
||||||
|
|
||||||
|
intersect_mins = K.maximum(b1_mins, b2_mins)
|
||||||
|
intersect_maxes = K.minimum(b1_maxes, b2_maxes)
|
||||||
|
intersect_wh = K.maximum(intersect_maxes - intersect_mins, 0.0)
|
||||||
|
intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
|
||||||
|
b1_area = b1_wh[..., 0] * b1_wh[..., 1]
|
||||||
|
b2_area = b2_wh[..., 0] * b2_wh[..., 1]
|
||||||
|
iou = intersect_area / (b1_area + b2_area - intersect_area)
|
||||||
|
|
||||||
|
return iou
|
||||||
|
|
||||||
|
|
||||||
|
def yolo_loss(args, anchors, num_classes, ignore_thresh=0.5, print_loss=False):
|
||||||
|
"""Return yolo_loss tensor
|
||||||
|
|
||||||
|
Parameters
|
||||||
|
----------
|
||||||
|
yolo_outputs: list of tensor, the output of yolo_body or tiny_yolo_body
|
||||||
|
y_true: list of array, the output of preprocess_true_boxes
|
||||||
|
    anchors: array, shape=(N, 2), wh
    num_classes: integer
    ignore_thresh: float, IoU threshold; the no-object confidence loss is skipped
        for predictions whose best IoU with a ground-truth box reaches it

    Returns
    -------
    loss: tensor, shape=(1,)

    """
    num_layers = len(anchors) // 3  # default setting
    yolo_outputs = args[:num_layers]
    y_true = args[num_layers:]
    anchor_mask = (
        [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]
    )
    input_shape = K.cast(K.shape(yolo_outputs[0])[1:3] * 32, K.dtype(y_true[0]))
    grid_shapes = [
        K.cast(K.shape(yolo_outputs[l])[1:3], K.dtype(y_true[0]))
        for l in range(num_layers)
    ]
    loss = 0
    m = K.shape(yolo_outputs[0])[0]  # batch size, tensor
    mf = K.cast(m, K.dtype(yolo_outputs[0]))

    for l in range(num_layers):
        object_mask = y_true[l][..., 4:5]
        true_class_probs = y_true[l][..., 5:]

        grid, raw_pred, pred_xy, pred_wh = yolo_head(
            yolo_outputs[l],
            anchors[anchor_mask[l]],
            num_classes,
            input_shape,
            calc_loss=True,
        )
        pred_box = K.concatenate([pred_xy, pred_wh])

        # Darknet raw box to calculate loss.
        raw_true_xy = y_true[l][..., :2] * grid_shapes[l][::-1] - grid
        raw_true_wh = K.log(
            y_true[l][..., 2:4] / anchors[anchor_mask[l]] * input_shape[::-1]
        )
        raw_true_wh = K.switch(
            object_mask, raw_true_wh, K.zeros_like(raw_true_wh)
        )  # avoid log(0)=-inf
        box_loss_scale = 2 - y_true[l][..., 2:3] * y_true[l][..., 3:4]

        # Find ignore mask, iterating over each image in the batch.
        ignore_mask = tf.TensorArray(K.dtype(y_true[0]), size=1, dynamic_size=True)
        object_mask_bool = K.cast(object_mask, "bool")

        def loop_body(b, ignore_mask):
            true_box = tf.boolean_mask(
                y_true[l][b, ..., 0:4], object_mask_bool[b, ..., 0]
            )
            iou = box_iou(pred_box[b], true_box)
            best_iou = K.max(iou, axis=-1)
            ignore_mask = ignore_mask.write(
                b, K.cast(best_iou < ignore_thresh, K.dtype(true_box))
            )
            return b + 1, ignore_mask

        _, ignore_mask = K.control_flow_ops.while_loop(
            lambda b, *args: b < m, loop_body, [0, ignore_mask]
        )
        ignore_mask = ignore_mask.stack()
        ignore_mask = K.expand_dims(ignore_mask, -1)

        # K.binary_crossentropy is helpful to avoid exp overflow.
        xy_loss = (
            object_mask
            * box_loss_scale
            * K.binary_crossentropy(raw_true_xy, raw_pred[..., 0:2], from_logits=True)
        )
        wh_loss = (
            object_mask
            * box_loss_scale
            * 0.5
            * K.square(raw_true_wh - raw_pred[..., 2:4])
        )
        confidence_loss = (
            object_mask
            * K.binary_crossentropy(object_mask, raw_pred[..., 4:5], from_logits=True)
            + (1 - object_mask)
            * K.binary_crossentropy(object_mask, raw_pred[..., 4:5], from_logits=True)
            * ignore_mask
        )
        class_loss = object_mask * K.binary_crossentropy(
            true_class_probs, raw_pred[..., 5:], from_logits=True
        )

        xy_loss = K.sum(xy_loss) / mf
        wh_loss = K.sum(wh_loss) / mf
        confidence_loss = K.sum(confidence_loss) / mf
        class_loss = K.sum(class_loss) / mf
        loss += xy_loss + wh_loss + confidence_loss + class_loss
        if print_loss:
            loss = tf.Print(
                loss,
                [
                    loss,
                    xy_loss,
                    wh_loss,
                    confidence_loss,
                    class_loss,
                    K.sum(ignore_mask),
                ],
                message="loss: ",
            )
    return loss
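The ignore mask built in `loop_body` above can be illustrated outside TensorFlow. The following is a minimal pure-Python sketch, not the repository's implementation: `iou_xywh` and `ignore_mask_for` are hypothetical helper names, and boxes are assumed to be `(x_center, y_center, width, height)` tuples in the same units. A predicted box keeps its no-object confidence penalty (mask value 1.0) only when its best IoU with any ground-truth box is below `ignore_thresh`.

```python
def iou_xywh(box_a, box_b):
    """IoU of two boxes given as (x_center, y_center, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Convert centers and sizes to corner coordinates.
    a_min_x, a_max_x = ax - aw / 2.0, ax + aw / 2.0
    a_min_y, a_max_y = ay - ah / 2.0, ay + ah / 2.0
    b_min_x, b_max_x = bx - bw / 2.0, bx + bw / 2.0
    b_min_y, b_max_y = by - bh / 2.0, by + bh / 2.0
    # Overlap is clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(a_max_x, b_max_x) - max(a_min_x, b_min_x))
    inter_h = max(0.0, min(a_max_y, b_max_y) - max(a_min_y, b_min_y))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def ignore_mask_for(pred_boxes, true_boxes, ignore_thresh=0.5):
    """Return 1.0 where the best IoU is below the threshold, else 0.0."""
    mask = []
    for pred in pred_boxes:
        # Assumption: with no ground-truth boxes the best IoU is taken as 0.
        best_iou = max((iou_xywh(pred, t) for t in true_boxes), default=0.0)
        mask.append(1.0 if best_iou < ignore_thresh else 0.0)
    return mask


preds = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.1, 0.1, 0.1)]
truth = [(0.5, 0.5, 0.2, 0.2)]
# The first prediction coincides with the ground truth, the second does not overlap it.
print(ignore_mask_for(preds, truth))  # [0.0, 1.0]
```

In the loss above, this mask multiplies only the `(1 - object_mask)` term, so predictions that already overlap a ground-truth box well are not pushed toward zero confidence.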
@@ -0,0 +1,165 @@
"""Miscellaneous utility functions."""

from functools import reduce

from PIL import Image

# import Image
import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

import re


def compose(*funcs):
    """Compose arbitrarily many functions, evaluated left to right.

    Reference: https://mathieularose.com/function-composition-in-python/
    """
    # return lambda x: reduce(lambda v, f: f(v), funcs, x)
    if funcs:
        return reduce(lambda f, g: lambda *a, **kw: g(f(*a, **kw)), funcs)
    else:
        raise ValueError("Composition of empty sequence not supported.")


def letterbox_image(image, size):
    """Resize image with unchanged aspect ratio using padding."""
    iw, ih = image.size
    w, h = size
    scale = min(w / iw, h / ih)
    nw = int(iw * scale)
    nh = int(ih * scale)

    image = image.resize((nw, nh), Image.BICUBIC)
    new_image = Image.new("RGB", size, (128, 128, 128))
    new_image.paste(image, ((w - nw) // 2, (h - nh) // 2))
    return new_image


def rand(a=0, b=1):
    return np.random.rand() * (b - a) + a


def get_random_data(
    annotation_line,
    input_shape,
    random=True,
    max_boxes=20,
    jitter=0.3,
    hue=0.1,
    sat=1.5,
    val=1.5,
    proc_img=True,
):
    """Random preprocessing for real-time data augmentation."""

    # This type of splitting makes sure that it is compatible with spaces in folder names
    # We split at the first space that is followed by a number
    tmp_split = re.split(r"( \d)", annotation_line, maxsplit=1)
    if len(tmp_split) > 2:
        line = tmp_split[0], tmp_split[1] + tmp_split[2]
    else:
        # No box found on this line: keep the filename and an empty box string
        line = tmp_split[0], ""
    # line[0] contains the filename
    image = Image.open(line[0])
    # The rest of the line includes bounding boxes
    line = line[1].split(" ")
    iw, ih = image.size
    h, w = input_shape
    box = np.array([np.array(list(map(int, box.split(",")))) for box in line[1:]])

    if not random:
        # resize image
        scale = min(w / iw, h / ih)
        nw = int(iw * scale)
        nh = int(ih * scale)
        dx = (w - nw) // 2
        dy = (h - nh) // 2
        image_data = 0
        if proc_img:
            image = image.resize((nw, nh), Image.BICUBIC)
            new_image = Image.new("RGB", (w, h), (128, 128, 128))
            new_image.paste(image, (dx, dy))
            image_data = np.array(new_image) / 255.0

        # correct boxes
        box_data = np.zeros((max_boxes, 5))
        if len(box) > 0:
            np.random.shuffle(box)
            if len(box) > max_boxes:
                box = box[:max_boxes]
            box[:, [0, 2]] = box[:, [0, 2]] * scale + dx
            box[:, [1, 3]] = box[:, [1, 3]] * scale + dy
            box_data[: len(box)] = box

        return image_data, box_data

    # resize image
    new_ar = w / h * rand(1 - jitter, 1 + jitter) / rand(1 - jitter, 1 + jitter)
    scale = rand(0.25, 2)
    if new_ar < 1:
        nh = int(scale * h)
        nw = int(nh * new_ar)
    else:
        nw = int(scale * w)
        nh = int(nw / new_ar)
    image = image.resize((nw, nh), Image.BICUBIC)

    # place image
    dx = int(rand(0, w - nw))
    dy = int(rand(0, h - nh))
    new_image = Image.new("RGB", (w, h), (128, 128, 128))
    new_image.paste(image, (dx, dy))
    image = new_image

    # flip image or not
    flip = rand() < 0.5
    if flip:
        image = image.transpose(Image.FLIP_LEFT_RIGHT)

    # distort image
    hue = rand(-hue, hue)
    sat = rand(1, sat) if rand() < 0.5 else 1 / rand(1, sat)
    val = rand(1, val) if rand() < 0.5 else 1 / rand(1, val)
    x = rgb_to_hsv(np.array(image) / 255.0)
    x[..., 0] += hue
    x[..., 0][x[..., 0] > 1] -= 1
    x[..., 0][x[..., 0] < 0] += 1
    x[..., 1] *= sat
    x[..., 2] *= val
    x[x > 1] = 1
    x[x < 0] = 0
    image_data = hsv_to_rgb(x)  # numpy array, 0 to 1

    # make gray
    gray = rand() < 0.2
    if gray:
        image_gray = np.dot(image_data, [0.299, 0.587, 0.114])
        # a gray RGB image is GGG
        image_data = np.moveaxis(np.stack([image_gray, image_gray, image_gray]), 0, -1)

    # invert colors
    invert = rand() < 0.1
    if invert:
        image_data = 1.0 - image_data

    # correct boxes
    box_data = np.zeros((max_boxes, 5))
    if len(box) > 0:
        np.random.shuffle(box)
        box[:, [0, 2]] = box[:, [0, 2]] * nw / iw + dx
        box[:, [1, 3]] = box[:, [1, 3]] * nh / ih + dy
        if flip:
            box[:, [0, 2]] = w - box[:, [2, 0]]
        box[:, 0:2][box[:, 0:2] < 0] = 0
        box[:, 2][box[:, 2] > w] = w
        box[:, 3][box[:, 3] > h] = h
        box_w = box[:, 2] - box[:, 0]
        box_h = box[:, 3] - box[:, 1]
        box = box[np.logical_and(box_w > 1, box_h > 1)]  # discard invalid box
        if len(box) > max_boxes:
            box = box[:max_boxes]
        box_data[: len(box)] = box

    return image_data, box_data
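The annotation parsing and the non-random branch of `get_random_data` can be tried without any image files. This is a minimal standard-library sketch under stated assumptions: `parse_annotation_line` and `letterbox_boxes` are hypothetical helpers (not part of the file above), each box is assumed to be `x_min,y_min,x_max,y_max,class_id`, and `int()` stands in for the truncation that happens when scaled values are written back into an integer array.

```python
import re


def parse_annotation_line(annotation_line):
    """Split 'path x1,y1,x2,y2,c ...' at the first space followed by a digit."""
    parts = re.split(r"( \d)", annotation_line, maxsplit=1)
    if len(parts) > 2:
        # parts[1] is the captured ' <digit>' separator; glue it back onto the boxes.
        path, rest = parts[0], parts[1] + parts[2]
    else:
        path, rest = parts[0], ""
    boxes = [[int(v) for v in box.split(",")] for box in rest.split()]
    return path, boxes


def letterbox_boxes(boxes, image_size, target_size):
    """Map boxes from the original image into the padded, resized image."""
    iw, ih = image_size
    w, h = target_size
    scale = min(w / iw, h / ih)
    nw, nh = int(iw * scale), int(ih * scale)
    dx, dy = (w - nw) // 2, (h - nh) // 2  # padding on the left and top
    return [
        [int(x1 * scale + dx), int(y1 * scale + dy),
         int(x2 * scale + dx), int(y2 * scale + dy), c]
        for x1, y1, x2, y2, c in boxes
    ]


# The folder name contains a space; the split still lands before the first box.
path, boxes = parse_annotation_line("my images/cat.jpg 10,20,110,220,0 5,5,50,50,1")
print(path)   # my images/cat.jpg
print(boxes)  # [[10, 20, 110, 220, 0], [5, 5, 50, 50, 1]]

# An 800x400 image letterboxed into 400x400: scale 0.5, 100 px of padding top and bottom.
print(letterbox_boxes(boxes, (800, 400), (400, 400)))
# [[5, 110, 55, 210, 0], [2, 102, 25, 125, 1]]
```

One limitation of this splitting rule: a file name that itself contains a space followed by a digit (for example `cat 1.jpg`) would be cut at that point instead.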
@@ -0,0 +1,162 @@
import sys, os
import argparse
from yolo import YOLO, detect_video
from PIL import Image
import readline

readline.parse_and_bind("tab: complete")


def detect_logo(yolo, img, save_img=True, save_img_path="output"):
    try:
        image = Image.open(img)
        if image.mode != "RGB":
            image = image.convert("RGB")
    except Exception:
        print("File Open Error! Try again!")
        return None, None

    prediction, r_image = yolo.detect_image(image)

    if save_img:
        r_image.save(os.path.join(save_img_path, os.path.basename(img)))

    return prediction, r_image


FLAGS = None

if __name__ == "__main__":
    # class YOLO defines the default value, so suppress any default here
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    """
    Command line options
    """
    parser.add_argument(
        "--model",
        type=str,
        dest="model_path",
        help="path to model weight file, default " + YOLO.get_defaults("model_path"),
    )

    parser.add_argument(
        "--anchors",
        type=str,
        dest="anchors_path",
        help="path to anchor definitions, default " + YOLO.get_defaults("anchors_path"),
    )

    parser.add_argument(
        "--classes",
        type=str,
        dest="classes_path",
        help="path to class definitions, default " + YOLO.get_defaults("classes_path"),
    )

    parser.add_argument(
        "--gpu_num",
        type=int,
        help="Number of GPUs to use, default " + str(YOLO.get_defaults("gpu_num")),
    )
    parser.add_argument(
        "--image",
        default=False,
        action="store_true",
        help="Image detection mode, will ignore all positional arguments",
    )
    parser.add_argument(
        "--batch",
        type=str,
        help="Image detection mode for each file specified in input txt, will ignore all positional arguments",
    )
    parser.add_argument(
        "--confidence",
        type=float,
        dest="score",
        default=0.3,
        help="Model confidence threshold above which to show predictions",
    )
    parser.add_argument(
        "--iou_min",
        type=float,
        dest="iou_thr",
        default=0.45,
        help="IoU threshold for pruning object candidates with higher IoU than higher score boxes",
    )
    parser.add_argument(
        "--input",
        nargs="?",
        type=str,
        required=False,
        default="input",
        help="Image or video input path",
    )

    parser.add_argument(
        "--output",
        nargs="?",
        type=str,
        default="output",
        help="output path: either directory for single/batch image, or filename for video",
    )

    FLAGS = parser.parse_args()

    if not os.path.isdir("output"):
        os.makedirs("output")

    if FLAGS.image:
        """
        Image detection mode, either prompt user input or was passed as argument
        """
        print("Image detection mode")

        yolo = YOLO(**vars(FLAGS))
        if FLAGS.input == "input":
            while True:
                FLAGS.input = input("Input image filename (q to quit):")
                if FLAGS.input in ["q", "quit"]:
                    yolo.close_session()
                    exit()

                img = FLAGS.input
                prediction, r_image = detect_logo(
                    yolo, img, save_img=True, save_img_path=FLAGS.output
                )
                if prediction is None:
                    continue

        else:
            img = FLAGS.input
            prediction, r_image = detect_logo(
                yolo, img, save_img=True, save_img_path=FLAGS.output
            )

        yolo.close_session()

    elif "batch" in FLAGS:
        print("Batch image detection mode: reading " + FLAGS.batch)

        with open(FLAGS.batch, "r") as file:
            file_list = [line.split(" ")[0] for line in file.read().splitlines()]
        out_txtfile = os.path.join(FLAGS.output, "data_pred.txt")
        txtfile = open(out_txtfile, "w")

        yolo = YOLO(**vars(FLAGS))

        for img in file_list[:10]:  # only the first 10 listed files are processed
            prediction, r_image = detect_logo(
                yolo, img, save_img=True, save_img_path=FLAGS.output
            )
            if prediction is None:
                # detect_logo could not open the file; skip it
                continue

            txtfile.write(img + " ")
            for pred in prediction:
                txtfile.write(",".join([str(p) for p in pred]) + " ")
            txtfile.write("\n")

        txtfile.close()
        yolo.close_session()
    elif "video" in FLAGS:
        # No --video option is defined above, so this branch is only reachable
        # if such an option is added to the parser.
        detect_video(YOLO(**vars(FLAGS)), FLAGS.video, FLAGS.output)
    else:
        print("Must specify at least --image or --batch. See usage with --help.")
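The mode dispatch above relies on `argument_default=argparse.SUPPRESS`: an option without an explicit `default` is absent from the namespace unless the user passes it, which is why `"batch" in FLAGS` works as a test. A small standard-library sketch of that behavior follows; the parser is a cut-down stand-in for the script's option set, and `files.txt` is a hypothetical file name.

```python
import argparse

parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
parser.add_argument("--batch", type=str)  # no default: suppressed when absent
parser.add_argument("--image", default=False, action="store_true")

no_flags = parser.parse_args([])
with_batch = parser.parse_args(["--batch", "files.txt"])

print("batch" in no_flags)    # False: the attribute is not set at all
print("batch" in with_batch)  # True: the attribute exists once the option is given
print(no_flags.image)         # False: an explicit default is kept despite SUPPRESS
```

This is also why `YOLO(**vars(FLAGS))` only receives the options that were actually supplied or that carry an explicit default, leaving the rest to the defaults defined by the class.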
@@ -0,0 +1,182 @@
[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=2
width=640
height=640
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear



[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=80
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 8

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 1,2,3
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=80
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
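The `filters=255` value on the convolution directly before each `[yolo]` section follows from the head layout: each anchor selected by `mask` predicts 4 box coordinates, 1 objectness score, and one score per class. Below is a short sketch of that arithmetic, plus the output grid sizes for the 640x640 input configured above, assuming detection strides of 32 and 16 for this two-head network. `yolo_head_filters` is a hypothetical helper name, not something Darknet provides.

```python
def yolo_head_filters(num_masked_anchors, num_classes):
    """Channels needed by the convolution feeding a [yolo] layer."""
    # Per anchor: 4 box values (x, y, w, h) + 1 objectness + one score per class.
    return num_masked_anchors * (num_classes + 5)


print(yolo_head_filters(3, 80))  # 255, matching filters=255 with classes=80
print(yolo_head_filters(3, 1))   # 18, what a single-class model would need

# Grid cells per side for a 640-pixel input at the two assumed strides.
print([640 // stride for stride in (32, 16)])  # [20, 40]
```

When `classes` is changed for a custom dataset, the `filters` value before every `[yolo]` section has to be updated with the same formula.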
@@ -0,0 +1,789 @@
[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width=640
height=640
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

######################

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear


[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1


[route]
layers = -4

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 61



[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear


[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1



[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 36



[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear


[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
|
||||||
|
classes=80
|
||||||
|
num=9
|
||||||
|
jitter=.3
|
||||||
|
ignore_thresh = .5
|
||||||
|
truth_thresh = 1
|
||||||
|
random=1
|
||||||
|
|
||||||
@@ -0,0 +1,283 @@
import os
import sys


def get_parent_dir(n=1):
    """Return the n-th parent directory of the directory containing this file."""
    current_path = os.path.dirname(os.path.abspath(__file__))
    for k in range(n):
        current_path = os.path.dirname(current_path)
    return current_path


src_path = os.path.join(get_parent_dir(1), "2_Training", "src")
utils_path = os.path.join(get_parent_dir(1), "Utils")

sys.path.append(src_path)
sys.path.append(utils_path)

import argparse
from keras_yolo3.yolo import YOLO, detect_video
from PIL import Image
from timeit import default_timer as timer
from utils import load_extractor_model, load_features, parse_input, detect_object
import test
import utils
import pandas as pd
import numpy as np
from Get_File_Paths import GetFileList
import random

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

# Set up folder names for default values
data_folder = os.path.join(get_parent_dir(n=1), "Data")

image_folder = os.path.join(data_folder, "Source_Images")

image_test_folder = os.path.join(image_folder, "Test_Images")

detection_results_folder = os.path.join(image_folder, "Test_Image_Detection_Results")
detection_results_file = os.path.join(detection_results_folder, "Detection_Results.csv")

model_folder = os.path.join(data_folder, "Model_Weights")

model_weights = os.path.join(model_folder, "trained_weights_final.h5")
model_classes = os.path.join(model_folder, "data_classes.txt")

anchors_path = os.path.join(src_path, "keras_yolo3", "model_data", "yolo_anchors.txt")

FLAGS = None

if __name__ == "__main__":
    # Delete all default flags
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    """
    Command line options
    """

    parser.add_argument(
        "--input_path",
        type=str,
        default=image_test_folder,
        help="Path to image/video directory. All subdirectories will be included. Default is "
        + image_test_folder,
    )

    parser.add_argument(
        "--output",
        type=str,
        default=detection_results_folder,
        help="Output path for detection results. Default is "
        + detection_results_folder,
    )

    parser.add_argument(
        "--no_save_img",
        default=False,
        action="store_true",
        help="Only save bounding box coordinates but do not save output images with annotated boxes. Default is False.",
    )

    parser.add_argument(
        "--file_types",
        "--names-list",
        nargs="*",
        default=[],
        help="Specify list of file types to include. Default is --file_types .jpg .jpeg .png .mp4",
    )

    parser.add_argument(
        "--yolo_model",
        type=str,
        dest="model_path",
        default=model_weights,
        help="Path to pre-trained weight files. Default is " + model_weights,
    )

    parser.add_argument(
        "--anchors",
        type=str,
        dest="anchors_path",
        default=anchors_path,
        help="Path to YOLO anchors. Default is " + anchors_path,
    )

    parser.add_argument(
        "--classes",
        type=str,
        dest="classes_path",
        default=model_classes,
        help="Path to YOLO class specifications. Default is " + model_classes,
    )

    parser.add_argument(
        "--gpu_num", type=int, default=1, help="Number of GPUs to use. Default is 1"
    )

    parser.add_argument(
        "--confidence",
        type=float,
        dest="score",
        default=0.25,
        help="Threshold for YOLO object confidence score to show predictions. Default is 0.25.",
    )

    parser.add_argument(
        "--box_file",
        type=str,
        dest="box",
        default=detection_results_file,
        help="File to save bounding box results to. Default is "
        + detection_results_file,
    )

    parser.add_argument(
        "--postfix",
        type=str,
        dest="postfix",
        default="_catface",
        help='Specify the postfix for images with bounding boxes. Default is "_catface"',
    )

    FLAGS = parser.parse_args()

    save_img = not FLAGS.no_save_img

    file_types = FLAGS.file_types

    if file_types:
        input_paths = GetFileList(FLAGS.input_path, endings=file_types)
    else:
        input_paths = GetFileList(FLAGS.input_path)

    # Split images and videos
    img_endings = (".jpg", ".jpeg", ".png")
    vid_endings = (".mp4", ".mpeg", ".mpg", ".avi")

    input_image_paths = []
    input_video_paths = []
    for item in input_paths:
        if item.endswith(img_endings):
            input_image_paths.append(item)
        elif item.endswith(vid_endings):
            input_video_paths.append(item)

    output_path = FLAGS.output
    if not os.path.exists(output_path):
        os.makedirs(output_path)

    # define YOLO detector
    yolo = YOLO(
        **{
            "model_path": FLAGS.model_path,
            "anchors_path": FLAGS.anchors_path,
            "classes_path": FLAGS.classes_path,
            "score": FLAGS.score,
            "gpu_num": FLAGS.gpu_num,
            "model_image_size": (416, 416),
        }
    )

    # Make a dataframe for the prediction outputs
    out_df = pd.DataFrame(
        columns=[
            "image",
            "image_path",
            "xmin",
            "ymin",
            "xmax",
            "ymax",
            "label",
            "confidence",
            "x_size",
            "y_size",
        ]
    )

    # labels to draw on images
    with open(FLAGS.classes_path, "r") as class_file:
        input_labels = [line.rstrip("\n") for line in class_file.readlines()]
    print("Found {} input labels: {} ...".format(len(input_labels), input_labels))

    if input_image_paths:
        print(
            "Found {} input images: {} ...".format(
                len(input_image_paths),
                [os.path.basename(f) for f in input_image_paths[:5]],
            )
        )
        start = timer()
        text_out = ""

        # This is for images
        for i, img_path in enumerate(input_image_paths):
            print(img_path)
            prediction, image = detect_object(
                yolo,
                img_path,
                save_img=save_img,
                save_img_path=FLAGS.output,
                postfix=FLAGS.postfix,
            )
            y_size, x_size, _ = np.array(image).shape
            for single_prediction in prediction:
                out_df = out_df.append(
                    pd.DataFrame(
                        [
                            [
                                os.path.basename(img_path.rstrip("\n")),
                                img_path.rstrip("\n"),
                            ]
                            + single_prediction
                            + [x_size, y_size]
                        ],
                        columns=[
                            "image",
                            "image_path",
                            "xmin",
                            "ymin",
                            "xmax",
                            "ymax",
                            "label",
                            "confidence",
                            "x_size",
                            "y_size",
                        ],
                    )
                )
        end = timer()
        print(
            "Processed {} images in {:.1f}sec - {:.1f}FPS".format(
                len(input_image_paths),
                end - start,
                len(input_image_paths) / (end - start),
            )
        )
        out_df.to_csv(FLAGS.box, index=False)

    # This is for videos
    if input_video_paths:
        print(
            "Found {} input videos: {} ...".format(
                len(input_video_paths),
                [os.path.basename(f) for f in input_video_paths[:5]],
            )
        )
        start = timer()
        for i, vid_path in enumerate(input_video_paths):
            output_path = os.path.join(
                FLAGS.output,
                os.path.basename(vid_path).replace(".", FLAGS.postfix + "."),
            )
            detect_video(yolo, vid_path, output_path=output_path)

        end = timer()
        print(
            "Processed {} videos in {:.1f}sec".format(
                len(input_video_paths), end - start
            )
        )
    # Close the current yolo session
    yolo.close_session()
@@ -0,0 +1,17 @@
# TrainYourOwnYOLO: Inference
In this step, we test our detector on cat and dog images and videos located in [`TrainYourOwnYOLO/Data/Source_Images/Test_Images`](/Data/Source_Images/Test_Images). If you would like to test the detector on your own images or videos, place them in the [`Test_Images`](/Data/Source_Images/Test_Images) folder.

## Testing Your Detector
To detect objects, run the detector script from within the [`TrainYourOwnYOLO/3_Inference`](/3_Inference/) directory:
```
python Detector.py
```
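
The script also accepts command line options; `--input_path`, `--confidence`, and `--no_save_img` are among those defined in `Detector.py`. As a minimal sketch, the helper below assembles such a call from Python. The function name `build_detector_command` and the folder name `my_images` are hypothetical and only serve as an illustration; it assumes the command is launched from the `3_Inference` directory.

```python
import sys


def build_detector_command(input_path, confidence=0.25, save_images=True):
    """Assemble an argument list for Detector.py (hypothetical helper)."""
    command = [
        sys.executable,
        "Detector.py",
        "--input_path",
        input_path,
        "--confidence",
        str(confidence),
    ]
    if not save_images:
        # --no_save_img still writes the CSV but skips the annotated images.
        command.append("--no_save_img")
    return command


# "my_images" is a placeholder folder name, not part of the repository.
command = build_detector_command("my_images", confidence=0.5, save_images=False)
print(" ".join(command[1:]))
# To actually run it from 3_Inference: subprocess.run(command, check=True)
```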
The outputs are saved to [`TrainYourOwnYOLO/Data/Source_Images/Test_Image_Detection_Results`](/Data/Source_Images/Test_Image_Detection_Results). The outputs include the original images with bounding boxes and confidence scores as well as a file called [`Detection_Results.csv`](/Data/Source_Images/Test_Image_Detection_Results/Detection_Results.csv) containing the image file paths and the bounding box coordinates. For videos, the output files are videos with bounding boxes and confidence scores. To list available command line options run `python Detector.py -h`.
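
Because `Detection_Results.csv` is a plain CSV file, it can be post-processed with the standard library. The sketch below is one possible way to keep only confident detections and express each box size relative to its image; the function name `load_detections` and the 0.5 threshold are assumptions chosen for illustration, and the two sample rows are copied from the results file in this repository.

```python
import csv
import io

# Two rows copied from the sample Detection_Results.csv; in practice, open the
# file written by Detector.py instead of this in-memory string.
SAMPLE_CSV = r"""image,image_path,xmin,ymin,xmax,ymax,label,confidence,x_size,y_size
105488.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\105488.jpg,122,56,333,324,0,0.25304827094078064,550,550
105488.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\105488.jpg,37,17,423,380,0,0.9925405979156494,550,550
"""


def load_detections(handle, min_confidence=0.5):
    """Return detections with confidence >= min_confidence as dicts."""
    detections = []
    for row in csv.DictReader(handle):
        confidence = float(row["confidence"])
        if confidence < min_confidence:
            continue
        xmin, ymin = int(row["xmin"]), int(row["ymin"])
        xmax, ymax = int(row["xmax"]), int(row["ymax"])
        detections.append(
            {
                "image": row["image"],
                "label": int(row["label"]),
                "confidence": confidence,
                # Box size as a fraction of the image size.
                "rel_width": (xmax - xmin) / int(row["x_size"]),
                "rel_height": (ymax - ymin) / int(row["y_size"]),
            }
        )
    return detections


detections = load_detections(io.StringIO(SAMPLE_CSV), min_confidence=0.5)
for d in detections:
    print(d["image"], round(d["confidence"], 3), round(d["rel_width"], 3))
```

To read the real file, pass `open(path, newline="")` as `handle`, where `path` points to the CSV in the detection results folder.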

### That's all!
Congratulations on building your own custom YOLOv3 computer vision application.

I hope you enjoyed this tutorial and that it helped you get your own computer vision project off the ground:

- Please **star** ⭐ this repo to get notifications on future improvements and
- Please **fork** 🍴 this repo if you would like to use it as part of your own project.
@@ -0,0 +1,58 @@
# Modified from https://stackoverflow.com/questions/25010369/wget-curl-large-file-from-google-drive

# To download houses weights run
# python Download_Weights.py 1aPCwYXFAOmklmNMLMh81Yduw5UrbHqkN Houses/trained_weights_final.h5

# To download openings weights run
# python Download_Weights.py 1FbvHzQWCjucXPbTbI4S1MnBLkAi58Mxv Openings/trained_weights_final.h5

import requests
import os
import progressbar


def download_file_from_google_drive(id, destination):
    def get_confirm_token(response):
        for key, value in response.cookies.items():
            if key.startswith("download_warning"):
                return value

        return None

    def save_response_content(response, destination):
        CHUNK_SIZE = 32768

        with open(destination, "wb") as f:
            bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
            i = 0
            for chunk in response.iter_content(CHUNK_SIZE):
                if chunk:  # filter out keep-alive new chunks
                    bar.update(i)
                    i += 1
                    f.write(chunk)

    URL = "https://docs.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(URL, params={"id": id}, stream=True)
    token = get_confirm_token(response)

    if token:
        params = {"id": id, "confirm": token}
        response = session.get(URL, params=params, stream=True)

    save_response_content(response, destination)


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 3:
        print("Usage: python Download_Weights.py drive_file_id destination_file_path")
    else:
        # TAKE ID FROM SHAREABLE LINK
        file_id = sys.argv[1]
        # DESTINATION FILE ON YOUR DISK
        destination = os.path.join(os.getcwd(), sys.argv[2])
        download_file_from_google_drive(file_id, destination)
@@ -0,0 +1 @@
|
|||||||
|
python Download_Weights.py 1MGXAP_XD_w4OExPP10UHsejWrMww8Tu7 trained_weights_final.h5
|
||||||
@@ -0,0 +1 @@
|
|||||||
|
Cat_Face
|
||||||
@@ -0,0 +1,88 @@
|
|||||||
|
image,image_path,xmin,ymin,xmax,ymax,label,confidence,x_size,y_size
|
||||||
|
0809_Cat_CNN.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\0809_Cat_CNN.jpg,258,29,574,315,0,0.7358033061027527,810,455
|
||||||
|
0_Cat_research.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\0_Cat_research.jpg,275,23,416,132,0,0.9388746023178101,615,409
|
||||||
|
0_PAY_GRUMPY_CAT.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\0_PAY_GRUMPY_CAT.jpg,114,15,491,318,0,0.8899086713790894,615,409
|
||||||
|
100.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\100.jpg,326,82,611,292,0,0.7353547215461731,750,600
|
||||||
|
10332358_3x2_700x467.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\10332358_3x2_700x467.jpg,167,0,557,399,0,0.9846770167350769,700,467
|
||||||
|
105488.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\105488.jpg,122,56,333,324,0,0.25304827094078064,550,550
|
||||||
|
105488.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\105488.jpg,37,17,423,380,0,0.9925405979156494,550,550
|
||||||
|
1451208.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\1451208.jpg,73,0,396,213,0,0.7688924670219421,480,274
|
||||||
|
14_20190709094206_8966619_large.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\14_20190709094206_8966619_large.jpg,357,55,1287,637,0,0.8452132940292358,1299,865
|
||||||
|
160825_vod_orig_historyofdogs_16x9_992.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\160825_vod_orig_historyofdogs_16x9_992.jpg,174,189,327,332,0,0.2502049505710602,992,558
|
||||||
|
160825_vod_orig_historyofdogs_16x9_992.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\160825_vod_orig_historyofdogs_16x9_992.jpg,475,149,585,358,0,0.26399022340774536,992,558
|
||||||
|
160825_vod_orig_historyofdogs_16x9_992.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\160825_vod_orig_historyofdogs_16x9_992.jpg,329,143,474,377,0,0.31680208444595337,992,558
|
||||||
|
160825_vod_orig_historyofdogs_16x9_992.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\160825_vod_orig_historyofdogs_16x9_992.jpg,21,176,311,349,0,0.568903923034668,992,558
|
||||||
|
17_cat_essentials.w700.h700.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\17_cat_essentials.w700.h700.jpg,62,258,276,450,0,0.9233705401420593,700,700
|
||||||
|
17_grumpy_cat_nymag_cover.w1200.h630.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\17_grumpy_cat_nymag_cover.w1200.h630.jpg,190,0,991,610,0,0.8651973009109497,1161,610
|
||||||
|
190207162239_01_fluffy_frozen_cat_trnd_exlarge_169.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\190207162239_01_fluffy_frozen_cat_trnd_exlarge_169.jpg,419,196,573,338,0,0.7172338366508484,780,438
|
||||||
|
1_3.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\1_3.jpg,798,238,1841,1206,0,0.9903472661972046,2620,2620
|
||||||
|
356950.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\356950.jpg,213,0,399,213,0,0.6213473081588745,780,520
|
||||||
|
420508170.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\420508170.jpg,144,0,250,100,0,0.3462313711643219,250,250
|
||||||
|
420508170.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\420508170.jpg,25,17,218,167,0,0.9814057350158691,250,250
|
||||||
|
49202627_303.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\49202627_303.jpg,380,6,668,296,0,0.9539755582809448,700,394
|
||||||
|
49a87dd8_035e_3f29_22f0_d762051e8f14.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\49a87dd8_035e_3f29_22f0_d762051e8f14.jpg,148,0,316,173,0,0.7422894835472107,350,200
|
||||||
|
4e66a4d0_7f81_4bda_882d_d7a18937f061_poster.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\4e66a4d0_7f81_4bda_882d_d7a18937f061_poster.jpg,0,0,1920,1080,0,0.965449333190918,1920,1080
|
||||||
|
5.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\5.jpg,0,10,227,226,0,0.9932052493095398,310,465
|
||||||
|
539814_istock_821264870.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\539814_istock_821264870.jpg,291,78,828,504,0,0.29192450642585754,1100,619
|
||||||
|
539814_istock_821264870.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\539814_istock_821264870.jpg,210,0,1045,619,0,0.9350296258926392,1100,619
|
||||||
|
5aa10ca0d877e618008b4678_750_562.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\5aa10ca0d877e618008b4678_750_562.jpg,155,0,691,562,0,0.9463245868682861,750,562
|
||||||
|
5d261a225562d.image.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\5d261a225562d.image.jpg,335,0,925,631,0,0.9926040172576904,960,763
|
||||||
|
633bd23689846b66b7891edea332a350.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\633bd23689846b66b7891edea332a350.jpg,114,0,387,479,0,0.6012320518493652,780,522
|
||||||
|
66185314_2297595197159603_5846688980096480369_n.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\66185314_2297595197159603_5846688980096480369_n.jpg,270,363,470,532,0,0.677074670791626,593,593
|
||||||
|
66185314_2297595197159603_5846688980096480369_n.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\66185314_2297595197159603_5846688980096480369_n.jpg,125,117,372,261,0,0.7377004623413086,593,593
|
||||||
|
67056618_500x406.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\67056618_500x406.jpg,165,20,412,282,0,0.8582413792610168,500,406
|
||||||
|
71aqYo6ZBpL._AC.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\71aqYo6ZBpL._AC.jpg,0,62,1148,961,0,0.39690640568733215,1200,1200
|
||||||
|
71aqYo6ZBpL._AC.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\71aqYo6ZBpL._AC.jpg,210,90,787,703,0,0.682583749294281,1200,1200
|
||||||
|
76DB73722C98408CB576F3C87486B768.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\76DB73722C98408CB576F3C87486B768.jpg,307,0,1016,570,0,0.9405931830406189,1280,720
|
||||||
|
961d1691_dfb88af9_1280w.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\961d1691_dfb88af9_1280w.jpg,208,0,874,639,0,0.8871027231216431,1280,707
|
||||||
|
Borowitz_BorisJohnsonDog.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Borowitz_BorisJohnsonDog.jpg,364,247,568,420,0,0.6505229473114014,727,485
|
||||||
|
Carolina_Dog.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Carolina_Dog.jpg,328,0,538,210,0,0.28266388177871704,600,400
|
||||||
|
Carolina_Dog.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Carolina_Dog.jpg,370,4,483,156,0,0.5948774218559265,600,400
|
||||||
|
china_cloned_cat_2_0.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\china_cloned_cat_2_0.jpg,581,56,848,342,0,0.7735435366630554,1024,683
|
||||||
|
chinese_company_that_cloned_a_pet_cat_now_want_to_copy_pets_t3h3.910.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\chinese_company_that_cloned_a_pet_cat_now_want_to_copy_pets_t3h3.910.jpg,116,54,499,339,0,0.4658162295818329,910,511
|
||||||
|
chinese_company_that_cloned_a_pet_cat_now_want_to_copy_pets_t3h3.910.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\chinese_company_that_cloned_a_pet_cat_now_want_to_copy_pets_t3h3.910.jpg,542,88,817,297,0,0.8265590071678162,910,511
|
||||||
|
chinese_company_that_cloned_a_pet_cat_now_want_to_copy_pets_t3h3.910.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\chinese_company_that_cloned_a_pet_cat_now_want_to_copy_pets_t3h3.910.jpg,187,120,431,287,0,0.9021586775779724,910,511
|
||||||
|
cute_puppy_body_image.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\cute_puppy_body_image.jpg,104,30,379,212,0,0.7816994786262512,600,408
|
||||||
|
dog_landing_hero_lg.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\dog_landing_hero_lg.jpg,0,0,722,447,0,0.9289771318435669,1080,447
|
||||||
|
dog_landing_hero_sm.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\dog_landing_hero_sm.jpg,322,0,519,412,0,0.28757601976394653,537,597
|
||||||
|
dog_landing_hero_sm.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\dog_landing_hero_sm.jpg,0,0,430,521,0,0.9773207902908325,537,597
|
||||||
|
e0dfdbdb_892b_4aad_84d6_03b5729899a1.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\e0dfdbdb_892b_4aad_84d6_03b5729899a1.jpg,61,101,183,265,0,0.3784005045890808,450,330
|
||||||
|
e0dfdbdb_892b_4aad_84d6_03b5729899a1.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\e0dfdbdb_892b_4aad_84d6_03b5729899a1.jpg,16,61,217,284,0,0.9891906380653381,450,330
|
||||||
|
Easter_Kittens_300x300.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Easter_Kittens_300x300.jpg,236,91,300,160,0,0.3217090964317322,300,300
|
||||||
|
Easter_Kittens_300x300.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Easter_Kittens_300x300.jpg,93,71,193,162,0,0.4590582847595215,300,300
|
||||||
|
Easter_Kittens_300x300.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Easter_Kittens_300x300.jpg,117,72,167,152,0,0.6096698641777039,300,300
|
||||||
|
Easter_Kittens_300x300.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Easter_Kittens_300x300.jpg,177,72,242,129,0,0.9193726181983948,300,300
|
||||||
|
Easter_Kittens_300x300.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Easter_Kittens_300x300.jpg,16,38,111,111,0,0.9278220534324646,300,300
|
||||||
|
ECgQI_aWsAEBZL_.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\ECgQI_aWsAEBZL_.jpg,268,21,905,558,0,0.9543792605400085,1024,768
|
||||||
|
Eloise1.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Eloise1.jpg,253,82,577,245,0,0.6300969123840332,960,640
|
||||||
|
Eloise1.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Eloise1.jpg,132,19,682,319,0,0.8653026819229126,960,640
|
||||||
|
Featured_Image_for_Cat.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Featured_Image_for_Cat.jpg,176,81,457,387,0,0.8383611440658569,600,600
|
||||||
|
feline_3333896_640.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\feline_3333896_640.jpg,40,0,497,406,0,0.9973390102386475,640,480
|
||||||
|
golden_retriever_puppy.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\golden_retriever_puppy.jpg,75,0,1020,734,0,0.3537549078464508,1100,734
|
||||||
|
golden_retriever_puppy.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\golden_retriever_puppy.jpg,218,98,852,594,0,0.8527566194534302,1100,734
|
||||||
|
husky_3380548__340.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\husky_3380548__340.jpg,75,0,288,274,0,0.9822015762329102,371,340
|
||||||
|
image.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\image.jpg,24,209,141,252,0,0.5600805878639221,400,339
|
||||||
|
image.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\image.jpg,74,110,205,162,0,0.6263542771339417,400,339
|
||||||
|
IMG_6115.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\IMG_6115.jpg,158,68,980,966,0,0.9327203631401062,1164,1063
|
||||||
|
KTYY2YRXUII6TA3V4POPNNUFLA.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\KTYY2YRXUII6TA3V4POPNNUFLA.jpg,221,0,1429,1153,0,0.9459948539733887,1484,1272
|
||||||
|
Lagotto_Romangolo_Tongue_Out.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Lagotto_Romangolo_Tongue_Out.jpg,200,0,728,487,0,0.9349266290664673,729,487
|
||||||
|
maxresdefault (1).jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\maxresdefault (1).jpg,307,120,1018,604,0,0.8544464707374573,1280,720
|
||||||
|
maxresdefault.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\maxresdefault.jpg,377,18,665,466,0,0.4301752746105194,1280,720
|
||||||
|
over_active_dog_211592482.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\over_active_dog_211592482.jpg,192,18,360,149,0,0.8399116396903992,590,428
|
||||||
|
pearl_16x9.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\pearl_16x9.jpg,74,0,360,242,0,0.8910990953445435,450,253
|
||||||
|
pets_4415649__340.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\pets_4415649__340.jpg,204,128,280,188,0,0.5803194046020508,513,340
|
||||||
|
photo_1510771463146_e89e6e86560e.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\photo_1510771463146_e89e6e86560e.jpg,297,907,1061,1419,0,0.45940396189689636,1362,2421
|
||||||
|
photo_1510771463146_e89e6e86560e.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\photo_1510771463146_e89e6e86560e.jpg,178,749,1198,1615,0,0.5203132629394531,1362,2421
|
||||||
|
pomeranian_68.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\pomeranian_68.jpg,217,126,417,204,0,0.3358507454395294,640,430
|
||||||
|
pomeranian_68.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\pomeranian_68.jpg,208,144,550,281,0,0.41130331158638,640,430
|
||||||
|
pug.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\pug.jpg,33,5,360,293,0,0.9813269376754761,400,300
|
||||||
|
puppy_1903313__340.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\puppy_1903313__340.jpg,181,64,418,217,0,0.43125417828559875,510,340
|
||||||
|
puppy_1903313__340.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\puppy_1903313__340.jpg,234,99,370,178,0,0.5488850474357605,510,340
puppy_dog.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\puppy_dog.jpg,166,100,369,196,0,0.4027504324913025,782,529
samoyed_puppy_dog_pictures.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\samoyed_puppy_dog_pictures.jpg,125,68,261,233,0,0.8838149309158325,600,400
Science_husky_1093906290.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Science_husky_1093906290.jpg,878,0,1289,1340,0,0.30920422077178955,2000,1500
Science_husky_1093906290.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\Science_husky_1093906290.jpg,579,0,1492,1275,0,0.9669162631034851,2000,1500
sei74187425.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\sei74187425.jpg,149,0,797,681,0,0.9778918623924255,968,681
single_minded_royalty_free_image_997141470_1558379890.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\single_minded_royalty_free_image_997141470_1558379890.jpg,50,5,467,268,0,0.5080410838127136,640,480
_107076735_shihtzu976.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\_107076735_shihtzu976.jpg,0,30,628,342,0,0.35563376545906067,660,371
_107076735_shihtzu976.jpg,C:\Users\Anton\OneDrive\GitHub\TrainYourOwnYOLO\Data\Source_Images\Test_Images\_107076735_shihtzu976.jpg,213,0,541,362,0,0.8478137254714966,660,371