Ahmet Cagatay Seker (1, 2), Sang Chul Ahn (1, 2)
(1) University of Science and Technology, (2) Korea Institute of Science and Technology
This study proposes a generalized framework for detecting and understanding product expiration dates. Figure 1 shows the overall architecture of the proposed framework. The framework consists of three networks applied in sequence: a date detection network, a DMY (day, month, year) detection network, and a recognition network. Given an input image, the date detection network detects the dates and extracts their regions from the entire input image. Then, the DMY detection network identifies and extracts the day, month, and year components from the detected date regions. Next, the recognition network recognizes the characters of the day, month, and year components. Finally, once the characters have been recognized, the appropriate date is selected as the expiration date.
The proposed framework can handle challenging expiration date cases and distinguish the 13 date formats listed in Table 1. It can detect and understand the expiration date even when the input image contains multiple dates. Figure 2 shows example results of expiration date understanding. Moreover, the proposed framework can detect the date, due mark, production mark, and code mark classes.
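The data flow described above can be sketched as follows. This is only an illustration of how the three stages chain together, not the released implementation: the function name `understand_expiration_date` and the four callables passed to it are hypothetical placeholders, and the rule used to pick the final date is left to the caller because it is not specified here.

```python
from typing import Any, Callable, Dict, Iterable, List


def understand_expiration_date(
    image: Any,
    detect_dates: Callable[[Any], Iterable[Any]],   # hypothetical stage 1
    detect_dmy: Callable[[Any], Dict[str, Any]],    # hypothetical stage 2
    recognize: Callable[[Any], str],                # hypothetical stage 3
    select: Callable[[List[Dict[str, str]]], Any],  # hypothetical final choice
) -> Any:
    """Chain the three stages and pick one date from all readings."""
    readings: List[Dict[str, str]] = []
    for region in detect_dates(image):
        # Stage 2 maps a date region to its component crops,
        # e.g. {"day": ..., "month": ..., "year": ...}.
        components = detect_dmy(region)
        # Stage 3 turns every component crop into text.
        readings.append(
            {name: recognize(crop) for name, crop in components.items()}
        )
    # Several dates may be present (e.g. production and expiration),
    # so the caller supplies the selection rule.
    return select(readings)
```

Passing the stages as callables keeps the sketch independent of any particular network; simple stand-in functions are enough to exercise the flow.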
We've released executable files of the proposed framework for Windows and Ubuntu operating systems.
To show the detection and understanding results separately, we divided the proposed framework into two executable files: run_detection.exe for detecting the dates and run_recognition.exe for understanding the detected dates.
Anyone who would like to test their own images can follow the instructions below.
Step 1: Create a folder named images_det that contains the test images.
Step 2: Place the images_det folder and all executable files in the same directory.
Step 3: Run the following commands for detecting the dates.
# for windows
cd path/to/exe
run_detection.exe
# for ubuntu
cd path/to/exe
./run_detection
After running the run_detection.exe file, a folder named results_det containing the detected dates and a folder named images_rec containing the cropped regions of the detected dates will be created in the same directory as the executable files. The images in the images_rec folder are the input images for the run_recognition.exe file.
Note that the images_rec folder also contains a cropped_img_list.json file that lists the cropped images produced for each input image of the detection network.
If you would like to use the run_recognition.exe file independently of the results of run_detection.exe, just remove the cropped_img_list.json file before running it.
Step 4: Run the following commands for understanding the dates.
# for windows
cd path/to/exe
run_recognition.exe
# for ubuntu
cd path/to/exe
./run_recognition
After running the run_recognition.exe file, a folder named results_rec will be created in the same directory as the executable files.
It contains the day, month, and year detection results and, in a .txt file, the recognized expiration dates with their meaning.
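Steps 1 to 4 can also be driven from a small script. The sketch below is an assumption-laden convenience wrapper, not part of the release: the helper names (`executable_path`, `missing_inputs`, `detach_recognition`, `run_step`) are hypothetical, and the only facts taken from the instructions above are the folder names, the executable names, and the cropped_img_list.json file.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def executable_path(base: Path, name: str, platform: str = sys.platform) -> Path:
    """Return the executable's path, adding .exe on Windows only."""
    suffix = ".exe" if platform.startswith("win") else ""
    return base / f"{name}{suffix}"


def missing_inputs(base: Path, platform: str = sys.platform) -> List[str]:
    """List what Steps 1 and 2 still require before detection can run."""
    problems = []
    images = base / "images_det"
    if not images.is_dir():
        problems.append("images_det folder not found")
    elif not any(p.is_file() for p in images.iterdir()):
        problems.append("images_det contains no files")
    if not executable_path(base, "run_detection", platform).is_file():
        problems.append("run_detection executable not found")
    return problems


def detach_recognition(base: Path) -> bool:
    """Delete cropped_img_list.json so recognition runs independently."""
    listing = base / "images_rec" / "cropped_img_list.json"
    if listing.is_file():
        listing.unlink()
        return True
    return False


def run_step(base: Path, name: str) -> None:
    """Run one executable from its own directory (the `cd` in the steps)."""
    command = [str(executable_path(base, name).resolve())]
    subprocess.run(command, cwd=base, check=True)
```

A typical session would call `missing_inputs(base)` first, then `run_step(base, "run_detection")` followed by `run_step(base, "run_recognition")`. `detach_recognition(base)` is only needed when images_rec holds your own crops rather than the output of the detection step.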
We also provide an executable file that uses your webcam to detect and understand expiration dates in real time. Run the following commands to start it.
# for windows
cd path/to/exe
run_webcam.exe
# for ubuntu
cd path/to/exe
./run_webcam
In the absence of a publicly available dataset, we built six novel datasets to detect and understand expiration dates. The collection of these datasets is named ExpDate, and it consists of real and synthetic images of products, dates, and date components with various challenging cases.
For the date detection task, a novel expiration date dataset, Products-Real, was created
by capturing 1767 real-world images with near-horizontal dates
from food, beverage, and medicine products.
It is split into training and test sets consisting of 1102 and 665 images.
Additionally, around 12k product images with synthetic dates were generated
to obtain various and challenging date samples.
This synthetic dataset is called Products-Synth and is used only for training.
To parse the dates into one of the date formats,
another novel dataset named Date-Synth was created
to train the DMY detection network, containing 128k images with a synthetic date.
A test dataset, Date-Real, was constructed for evaluation.
For the recognition task, a novel dataset, Components-Synth, was collected
by generating 450k training images with synthetic date components.
The components of the dates in Date-Real were used to create a new dataset named
Components-Real for the evaluation of the recognition task.
Figure 3 shows some images from the ExpDate dataset.
In total, ExpDate consists of six different datasets containing real and synthetic images.
It includes challenging cases with various date scales, different date fonts
(including the dot-matrix), reflection, complex backgrounds, etc.
Additionally, it contains images with multiple dates on the products (i.e., expiration and production dates).
The thirteen expiration date formats in ExpDate are shown in Table 1.
#No | Date Format | Sample | #No | Date Format | Sample |
---|---|---|---|---|---|
1 | DDMMYY | 29 10 23 | 8 | MMDD | 10 29 |
2 | DDMMMYY | 29 OCT 23 | 9 | MMYYYY | 10 2023 |
3 | DDMMYYYY | 29 10 2023 | 10 | MMMYYYY | OCT 2023 |
4 | DDMMMYYYY | 23 OCT 2023 | 11 | MMMDDYY | OCT 29 23 |
5 | YYYYMM | 2023 10 | 12 | MMMDDYYYY | OCT 29 2023 |
6 | YYYYMMDD | 2023 10 29 | 13 | YYYYMMMDD | 2023 OCT 29 |
7 | YYMMDD | 23 10 29 | - | - | - |
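
To make the format names in Table 1 concrete, the following sketch maps a space-separated sample onto day, month, and year values using the format name alone. It is an illustrative helper, not the framework's recognition logic: `parse_date` and `MONTHS` are hypothetical names, English three-letter month abbreviations are assumed, and two-digit years are assumed to fall in the 2000s.

```python
import re
from typing import Dict, Optional

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def parse_date(fmt: str, text: str) -> Dict[str, Optional[int]]:
    """Interpret `text` according to a Table 1 format name such as DDMMMYY."""
    # Split the format name into runs: "DDMMMYY" -> ["DD", "MMM", "YY"].
    fields = re.findall(r"D+|M+|Y+", fmt)
    tokens = text.split()
    if len(fields) != len(tokens):
        raise ValueError(f"{text!r} does not match format {fmt}")
    result: Dict[str, Optional[int]] = {"day": None, "month": None, "year": None}
    for field, token in zip(fields, tokens):
        if field == "MMM":
            # Month written as letters; raises ValueError if unknown.
            value = MONTHS.index(token.upper()) + 1
        else:
            if len(token) != len(field) or not token.isdigit():
                raise ValueError(f"{token!r} does not match field {field}")
            value = int(token)
        if field == "YY":
            value += 2000  # assumption: two-digit years mean 20YY
        key = {"D": "day", "M": "month", "Y": "year"}[field[0]]
        result[key] = value
    return result
```

For example, `parse_date("DDMMMYY", "29 OCT 23")` yields day 29, month 10, and year 2023, while formats without a day or year (such as YYYYMM or MMDD) leave the missing component as `None`.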
Each date in the Products-Real and Products-Synth datasets is annotated with its class, bounding box coordinates, date transcription, and the image width and height. Four classes are defined in the training sets: date, due, prod, and code. Unlike in the training set, expiration dates in the test set of Products-Real are specifically labeled with the "exp" class for easy evaluation. Each component in the Date-Real and Date-Synth datasets is annotated with its class, bounding box, and transcription, where the classes are day, month, and year. Finally, the Components-Real and Components-Synth datasets consist of day, month, and year component images and their transcriptions.
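One way to hold a Products-Real or Products-Synth annotation in memory is sketched below. The on-disk annotation format is not described here, so everything about the layout is an assumption: the `DateAnnotation` class, its field names, and the (x_min, y_min, x_max, y_max) pixel convention for the bounding box are hypothetical. Only the list of fields and the class labels come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Labels named in the text: four training classes plus "exp" in the test set.
DETECTION_CLASSES = {"date", "due", "prod", "code", "exp"}


@dataclass(frozen=True)
class DateAnnotation:
    label: str
    box: Tuple[int, int, int, int]  # assumed (x_min, y_min, x_max, y_max)
    transcription: str
    image_width: int
    image_height: int

    def __post_init__(self) -> None:
        if self.label not in DETECTION_CLASSES:
            raise ValueError(f"unknown class {self.label!r}")
        x_min, y_min, x_max, y_max = self.box
        inside = (0 <= x_min < x_max <= self.image_width
                  and 0 <= y_min < y_max <= self.image_height)
        if not inside:
            raise ValueError(f"box {self.box} lies outside the image")


def expiration_dates(annotations: List[DateAnnotation]) -> List[DateAnnotation]:
    """Keep only the annotations labeled as expiration dates."""
    return [a for a in annotations if a.label == "exp"]
```

Validating the label and box at construction time catches malformed records early, which is convenient when converting annotations from whatever format they are actually distributed in.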
If you find this study useful for your research, please consider citing:
@article{seker2022generalized,
  title={A generalized framework for recognition of expiration dates on product packages using fully convolutional networks},
  author={Seker, Ahmet Cagatay and Ahn, Sang Chul},
  journal={Expert Systems with Applications},
  pages={117310},
  year={2022},
  publisher={Elsevier}
}