# Real-time-Text-Detection-DBNet

**Repository Path**: gwmen_dl/Real-time-Text-Detection-DBNet

## Basic Information

- **Project Name**: Real-time-Text-Detection-DBNet
- **Description**: Text location detection
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-12-27
- **Last Updated**: 2025-02-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Real-time-Text-Detection

PyTorch re-implementation of [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/abs/1911.08947).

### Differences between the paper and this implementation

1. Dice loss is used instead of BCE (binary cross-entropy) loss.
2. The backbone network uses normal convolution rather than deformable convolution.
3. The architecture of the backbone network is a simple FPN.
4. OHEM is not implemented.
5. The ground truth of the threshold map is a constant 1 rather than 'the distance to the closest segment'.

### Introduction

Thanks to this project:

- https://github.com/WenmuZhou/PAN.pytorch

The features are summarized below:

+ Use **resnet18/resnet50/shufflenetV2** as backbone.

### Contents

1. [Installation](#installation)
2. [Download](#download)
3. [Train](#train)
4. [Predict](#predict)
5. [Eval](#eval)
6. [Demo](#demo)

### Installation

1. PyTorch 1.1.0

### Download

1. ShuffleNet_V2 models trained on ICDAR 2013+2015 (training set): https://pan.baidu.com/s/1Um0wzbTFjJC0jdJ703GR7Q or https://mega.nz/#!WdhxXAxT!oGURvmbQFqTHu5hljUPdbDMzI75_UO2iWLaXX5dJrDw

### Train

1. Modify genText.py to generate the txt list files for the training/testing data.
2. Modify config.json.
3. Run

```python
python train.py
```

### Predict

1. Run

```python
python predict.py
```

### Eval

Run

```python
python eval.py
```

### Examples

- contour
- bbox

### Todo

- [ ] MobileNet backbone
- [ ] Deformable convolution
- [ ] TensorBoard support
- [ ] FPN --> architecture in the paper
- [ ] Dice loss --> BCE loss
- [ ] Threshold map ground truth uses 1 --> threshold map ground truth uses distance (using 1 accelerates label generation)
- [ ] OHEM
- [ ] OpenCV_DNN inference API for CPU machines
- [ ] Caffe version (for deploying with MNN/NCNN)
- [ ] ICDAR13 / ICDAR15 / CTW1500 / MLT2017 / Total-Text
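
### Dice loss sketch

This README states that dice loss is used in place of BCE loss, but does not show the loss code. For readers unfamiliar with the term, the following is a minimal sketch of the generic dice loss formula, `1 - 2 * sum(p * g) / (sum(p) + sum(g) + eps)`. It is an illustration and is not taken from this repository: the function name `dice_loss`, the `eps` default, and the use of flattened plain-Python lists instead of PyTorch tensors are all assumptions made so the example runs without dependencies.

```python
from typing import Sequence


def dice_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-6) -> float:
    """Generic dice loss between a probability map and a binary mask.

    Hypothetical helper for illustration only; both inputs are flattened
    maps of equal length, with `pred` in [0, 1] and `target` in {0, 1}.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Soft intersection: sum of element-wise products.
    intersection = sum(p * t for p, t in zip(pred, target))
    # eps (assumed value) guards against division by zero on empty maps.
    denominator = sum(pred) + sum(target) + eps
    return 1.0 - 2.0 * intersection / denominator


if __name__ == "__main__":
    mask = [0.0, 1.0, 1.0, 0.0]
    print(dice_loss(mask, mask))                  # near 0: perfect overlap
    print(dice_loss([1.0, 0.0, 0.0, 1.0], mask))  # 1.0: no overlap
```

Because the loss depends on the overlap ratio rather than on a per-pixel average, it is less dominated by the large background area typical of text detection maps, which is one common reason to choose it over BCE.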