# Real-time-Text-Detection-DBNet
**Repository Path**: gwmen_dl/Real-time-Text-Detection-DBNet
## Basic Information
- **Project Name**: Real-time-Text-Detection-DBNet
- **Description**: Text location detection
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-12-27
- **Last Updated**: 2025-02-05
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Real-time-Text-Detection
PyTorch re-implementation of [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/abs/1911.08947)
### Differences between the paper and this implementation
1. Uses dice loss instead of BCE (binary cross-entropy) loss.
2. Uses standard convolution rather than deformable convolution in the backbone network.
3. The architecture of the backbone network is a simple FPN.
4. OHEM is not implemented.
5. The ground truth of the threshold map is a constant 1 rather than "the distance to the closest segment".
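For readers unfamiliar with the dice loss mentioned in item 1, here is a minimal, framework-free sketch. It is not the repository's code: the function name `dice_loss`, the optional `mask` argument, and the `eps` smoothing term are assumptions made for illustration, and the inputs are plain flattened lists rather than PyTorch tensors. The same arithmetic carries over to tensors.

```python
def dice_loss(pred, target, mask=None, eps=1e-6):
    """Dice loss over flattened sequences (illustrative sketch).

    pred:   predicted probabilities in [0, 1]
    target: ground-truth labels (0 or 1)
    mask:   optional per-pixel weights (1 = use pixel, 0 = ignore); hypothetical
    eps:    small constant that avoids division by zero; hypothetical default
    """
    if mask is None:
        mask = [1.0] * len(pred)
    if not (len(pred) == len(target) == len(mask)):
        raise ValueError("pred, target and mask must have the same length")

    # Overlap between prediction and ground truth.
    intersection = sum(p * t * m for p, t, m in zip(pred, target, mask))
    # Sum of both "areas" (the denominator of the Dice coefficient).
    total = sum(p * m for p, m in zip(pred, mask)) + sum(
        t * m for t, m in zip(target, mask)
    )
    # Dice coefficient is 2 * overlap / total; the loss is its complement.
    return 1.0 - 2.0 * intersection / (total + eps)
```

A prediction that matches the ground truth exactly gives a loss close to 0, and a prediction with no overlap gives a loss of 1. Unlike a per-pixel BCE loss, the value depends on the overlap ratio rather than on the number of background pixels.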
### Introduction
Thanks to this project:
- https://github.com/WenmuZhou/PAN.pytorch
The features are summarized below:
+ Use **resnet18/resnet50/shufflenetV2** as backbone.
### Contents
1. [Installation](#installation)
2. [Download](#download)
3. [Train](#train)
4. [Predict](#predict)
5. [Eval](#eval)
6. [Examples](#examples)
### Installation
1. PyTorch 1.1.0
### Download
1. ShuffleNetV2 models trained on the ICDAR 2013 + 2015 training sets:
https://pan.baidu.com/s/1Um0wzbTFjJC0jdJ703GR7Q
or https://mega.nz/#!WdhxXAxT!oGURvmbQFqTHu5hljUPdbDMzI75_UO2iWLaXX5dJrDw
### Train
1. Modify `genText.py` to generate the txt list files for the training/testing data.
2. Modify `config.json`.
3. Run:
```bash
python train.py
```
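As a rough illustration of step 1, the sketch below writes a txt list file that pairs each image with its annotation file. It is not the contents of `genText.py`. The function name `write_list_file`, the tab-separated `image_path<TAB>label_path` line format, and the `<image stem>.txt` label naming are all assumptions, so check them against what the data loader in this repository actually expects.

```python
from pathlib import Path


def write_list_file(image_dir, label_dir, out_path,
                    image_exts=(".jpg", ".jpeg", ".png")):
    """Write one 'image_path<TAB>label_path' line per annotated image.

    Assumptions (hypothetical, not taken from the repository):
      * the label for 'foo.jpg' is stored as 'foo.txt' in label_dir
      * the list file uses a tab-separated, one-pair-per-line layout
    Returns the number of lines written.
    """
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    lines = []
    for image_path in sorted(image_dir.iterdir()):
        # Ignore anything that is not an image file.
        if image_path.suffix.lower() not in image_exts:
            continue
        label_path = label_dir / (image_path.stem + ".txt")
        # Skip images that have no annotation file.
        if label_path.is_file():
            lines.append(f"{image_path}\t{label_path}")
    text = "\n".join(lines) + ("\n" if lines else "")
    Path(out_path).write_text(text, encoding="utf-8")
    return len(lines)
```

Calling `write_list_file("data/train/img", "data/train/gt", "train.txt")` (placeholder paths) would produce a list that `config.json` could then point to, assuming it takes a list-file path.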
### Predict
1. Run:
```bash
python predict.py
```
### Eval
Run:
```bash
python eval.py
```
### Examples
### Todo
- [ ] MobileNet backbone
- [ ] Deformable convolution
- [ ] TensorBoard support
- [ ] FPN --> Architecture in the paper
- [ ] Dice Loss --> BCE Loss
- [ ] Threshold map ground truth: constant 1 --> distance to the closest segment (using 1 speeds up label generation)
- [ ] OHEM
- [ ] OpenCV DNN inference API for CPU-only machines
- [ ] Caffe version (for deploying with MNN/NCNN)
- [ ] ICDAR13 / ICDAR15 / CTW1500 / MLT2017 / Total-Text