OCR

Published: 09 Oct 2015 Category: deep_learning

Papers

Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

End-to-End Text Recognition with Convolutional Neural Networks

Word Spotting and Recognition with Embedded Attributes

Reading Text in the Wild with Convolutional Neural Networks

Deep structured output learning for unconstrained text recognition

  • intro: “propose an architecture consisting of a character sequence CNN and an N-gram encoding CNN which act on an input image in parallel and whose outputs are utilized along with a CRF model to recognize the text content present within the image.”
  • arxiv: http://arxiv.org/abs/1412.5903
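
As a rough illustration of the idea quoted above (position-wise character scores combined with N-gram scores to pick a word), here is a minimal sketch. It is not the paper's implementation: the score tables are hand-written toy numbers standing in for CNN outputs, the candidate list is enumerated exhaustively rather than searched, and every name (`word_score`, `best_word`, `CHAR_SCORES`, `NGRAM_SCORES`) is hypothetical.

```python
from typing import Dict, List, Sequence


def higher_order_ngrams(word: str, max_n: int) -> List[str]:
    """Return every contiguous substring of `word` with length 2..max_n."""
    return [word[i:i + n]
            for n in range(2, max_n + 1)
            for i in range(len(word) - n + 1)]


def word_score(word: str,
               char_scores: Sequence[Dict[str, float]],
               ngram_scores: Dict[str, float],
               max_n: int = 2) -> float:
    """Sum per-position character scores and N-gram scores for one candidate.

    Assumption: unseen characters and N-grams contribute 0.0.
    """
    if len(word) > len(char_scores):
        return float("-inf")  # candidate is longer than the scored positions
    unary = sum(char_scores[i].get(ch, 0.0) for i, ch in enumerate(word))
    higher = sum(ngram_scores.get(g, 0.0)
                 for g in higher_order_ngrams(word, max_n))
    return unary + higher


def best_word(candidates: Sequence[str],
              char_scores: Sequence[Dict[str, float]],
              ngram_scores: Dict[str, float],
              max_n: int = 2) -> str:
    """Pick the candidate with the highest combined score."""
    return max(candidates,
               key=lambda w: word_score(w, char_scores, ngram_scores, max_n))


# Toy scores (made up for illustration, not taken from any model).
CHAR_SCORES = [
    {"c": 2.0, "e": 1.5},   # position 0
    {"a": 1.0, "o": 1.2},   # position 1: the character model slightly prefers "o"
    {"t": 2.0},             # position 2
]
NGRAM_SCORES = {"ca": 1.0, "at": 1.5, "co": 0.1, "ot": 0.2}
CANDIDATES = ["cat", "cot", "eat"]

if __name__ == "__main__":
    # Character scores alone pick "cot"; adding bigram scores flips it to "cat".
    print(best_word(CANDIDATES, CHAR_SCORES, {}))
    print(best_word(CANDIDATES, CHAR_SCORES, NGRAM_SCORES))
```

The toy numbers are chosen so that the two score sources disagree, which is the situation where combining them matters.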

Deep Features for Text Spotting

Reading Scene Text in Deep Convolutional Sequences

DeepFont: Identify Your Font from An Image

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks

DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images

End-to-End Interpretation of the French Street Name Signs Dataset

End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance

Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading

Improving Text Proposals for Scene Images with Fully Convolutional Networks

  • intro: Universitat Autonoma de Barcelona (UAB) & University of Florence
  • intro: International Conference on Pattern Recognition (ICPR) - DLPR (Deep Learning for Pattern Recognition) workshop
  • arxiv: https://arxiv.org/abs/1702.05089

Scene Text Eraser

https://arxiv.org/abs/1705.02772

Attention-based Extraction of Structured Information from Street View Imagery

Implicit Language Model in LSTM for OCR

https://arxiv.org/abs/1805.09441

Scene Text Magnifier

Text Detection

Object Proposals for Text Extraction in the Wild

Text-Attentional Convolutional Neural Networks for Scene Text Detection

Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network

Synthetic Data for Text Localisation in Natural Images

Scene Text Detection via Holistic, Multi-Channel Prediction

Detecting Text in Natural Image with Connectionist Text Proposal Network

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

TextBoxes++: A Single-Shot Oriented Scene Text Detector

Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

Detecting Oriented Text in Natural Images by Linking Segments

Deep Direct Regression for Multi-Oriented Scene Text Detection

Cascaded Segmentation-Detection Networks for Word-Level Text Spotting

https://arxiv.org/abs/1704.00834

Text-Detection-using-py-faster-rcnn-framework

WordFence: Text Detection in Natural Images with Border Awareness

SSD-text detection: Text Detector

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

R-PHOC: Segmentation-Free Word Spotting using CNN

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

EAST: An Efficient and Accurate Scene Text Detector

Deep Scene Text Detection with Connected Component Proposals

Single Shot Text Detector with Regional Attention

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

https://arxiv.org/abs/1709.03272

Deep Residual Text Detection Network for Scene Text

  • intro: IAPR International Conference on Document Analysis and Recognition (ICDAR) 2017. Samsung R&D Institute of China, Beijing
  • arxiv: https://arxiv.org/abs/1711.04147

Feature Enhancement Network: A Refined Scene Text Detector

ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene

https://arxiv.org/abs/1711.11249

Detecting Curve Text in the Wild: New Dataset and New Solution

FOTS: Fast Oriented Text Spotting with a Unified Network

https://arxiv.org/abs/1801.01671

PixelLink: Detecting Scene Text via Instance Segmentation

Sliding Line Point Regression for Shape Robust Scene Text Detection

https://arxiv.org/abs/1801.09969

Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

Single Shot TextSpotter with Explicit Alignment and Attention

Rotation-Sensitive Regression for Oriented Scene Text Detection

Detecting Multi-Oriented Text with Corner-based Region Proposals

An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches

https://arxiv.org/abs/1804.09003

IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

Boosting up Scene Text Detectors with Guided CNN

https://arxiv.org/abs/1805.04132

Shape Robust Text Detection with Progressive Scale Expansion Network

A Single Shot Text Detector with Scale-adaptive Anchors

https://arxiv.org/abs/1807.01884

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping

TextContourNet: a Flexible and Effective Framework for Improving Scene Text Detection Architecture with a Multi-task Cascade

https://arxiv.org/abs/1809.03050

Correlation Propagation Networks for Scene Text Detection

https://arxiv.org/abs/1810.00304

Scene Text Detection with Supervised Pyramid Context Network

Improving Rotated Text Detection with Rotation Region Proposal Networks

https://arxiv.org/abs/1811.07031

Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks

https://arxiv.org/abs/1811.07432

Mask R-CNN with Pyramid Attention Network for Scene Text Detection

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection

Detecting Text in the Wild with Deep Character Embedding Network

MSR: Multi-Scale Shape Regression for Scene Text Detection

https://arxiv.org/abs/1901.02596

Pyramid Mask Text Detector

Shape Robust Text Detection with Progressive Scale Expansion Network

Tightness-aware Evaluation Protocol for Scene Text Detection

Character Region Awareness for Text Detection

Towards End-to-End Text Spotting in Natural Scenes

  • intro: An extension of the work “Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks”, Proc. Int. Conf. Comp. Vision 2017
  • arxiv: https://arxiv.org/abs/1906.06013

A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning

Geometry Normalization Networks for Accurate Scene Text Detection

Real-time Scene Text Detection with Differentiable Binarization

TextTubes for Detecting Curved Text in the Wild

https://arxiv.org/abs/1912.08990

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

DGST : Discriminator Guided Scene Text detector

https://arxiv.org/abs/2002.12509

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

Vision-Language Pre-Training for Boosting Scene Text Detectors

Text Recognition

Sequence to sequence learning for unconstrained scene text recognition

Drawing and Recognizing Chinese Characters with Recurrent Neural Network

Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition

Visual attention models for scene text recognition

https://arxiv.org/abs/1706.01487

Focusing Attention: Towards Accurate Text Recognition in Natural Images

Scene Text Recognition with Sliding Convolutional Character Models

https://arxiv.org/abs/1709.01727

AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition

https://arxiv.org/abs/1710.03425

A New Hybrid-parameter Recurrent Neural Networks for Online Handwritten Chinese Character Recognition

https://arxiv.org/abs/1711.02809

AON: Towards Arbitrarily-Oriented Text Recognition

Arbitrarily-Oriented Text Recognition

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

https://arxiv.org/abs/1712.05404

Edit Probability for Scene Text Recognition

SCAN: Sliding Convolutional Attention Network for Scene Text Recognition

https://arxiv.org/abs/1806.00578

Adaptive Adversarial Attack on Scene Text Recognition

ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification

https://arxiv.org/abs/1812.05824

A Multi-Object Rectified Attention Network for Scene Text Recognition

SAFE: Scale Aware Feature Encoder for Scene Text Recognition

A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition

https://arxiv.org/abs/1904.01375

FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition

https://arxiv.org/abs/1904.09405

Towards Accurate Scene Text Recognition with Semantic Reasoning Networks

FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition

Text Spotting & Text Detection + Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition

Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework

FOTS: Fast Oriented Text Spotting with a Unified Network

https://arxiv.org/abs/1801.01671

Single Shot TextSpotter with Explicit Alignment and Attention

An end-to-end TextSpotter with Explicit Alignment and Attention

Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes

Scene Text Detection and Recognition: The Deep Learning Era

A Novel Integrated Framework for Learning both Text Detection and Recognition

Efficient Video Scene Text Spotting: Unifying Detection, Tracking, and Recognition

A Multitask Network for Localization and Recognition of Text in Images

GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition

Convolutional Character Networks

RoadText-1K: Text Detection & Recognition Dataset for Driving Videos

SPTS: Single-Point Text Spotting

  • intro: Chinese University of Hong Kong & South China University of Technology & University of Adelaide & ByteDance Inc. & Huawei Technologies & Zhejiang University
  • arxiv: https://arxiv.org/abs/2112.07917

Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

End-to-End Video Text Spotting with Transformer

Text Spotting Transformers

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

Breaking Captcha

Using deep learning to break a Captcha system

Breaking reddit captcha with 96% accuracy

I’m not a human: Breaking the Google reCAPTCHA

Neural Net CAPTCHA Cracker

Recurrent neural networks for decoding CAPTCHAS

Reading irctc captchas with 95% accuracy using deep learning

End-to-End OCR: A CNN-Based Implementation (端到端的OCR:基于CNN的实现)

I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs

SimGAN-Captcha

Handwritten Recognition

High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps

Recognize your handwritten numbers

https://medium.com/@o.kroeger/recognize-your-handwritten-numbers-3f007cbe46ff#.jllz62xgu

Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras

MNIST Handwritten Digit Classifier

How to Recognize a Handwritten Digit Dataset with a Convolutional Neural Network (CNN)? (如何用卷积神经网络CNN识别手写数字集?)

LeNet – Convolutional Neural Network in Python

Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention

MLPaint: the Real-Time Handwritten Digit Recognizer

Training a Computer to Recognize Your Handwriting

https://medium.com/@annalyzin/training-a-computer-to-recognize-your-handwriting-24b808fb584#.gd4pb9jk2

Using TensorFlow to create your own handwriting recognition engine

Building a Deep Handwritten Digits Classifier using Microsoft Cognitive Toolkit

Hand Writing Recognition Using Convolutional Neural Networks

Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling

Handwritten digit string recognition by combination of residual network and RNN-CTC

https://arxiv.org/abs/1710.03112

Plate Recognition

Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs

Number plate recognition with Tensorflow

end-to-end-for-plate-recognition

Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN

  • intro: International Workshop on Advanced Image Technology (IWAIT 2017), January 8-10, 2017, Penang, Malaysia
  • arxiv: https://arxiv.org/abs/1701.06439

License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks

Adversarial Generation of Training Examples for Vehicle License Plate Recognition

https://arxiv.org/abs/1707.03124

Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks

Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline

High Accuracy Chinese Plate Recognition Framework

LPRNet: License Plate Recognition via Deep Neural Networks

  • intro: Intel IOTG Computer Vision Group
  • intro: works in real-time with recognition accuracy up to 95% for Chinese license plates: 3 ms/plate on NVIDIA® GeForce™ GTX 1080 and 1.3 ms/plate on Intel® Core™ i7-6700K CPU.
  • arxiv: https://arxiv.org/abs/1806.10447

How many labeled license plates are needed?

An End-to-End Neural Network for Multi-line License Plate Recognition

Blogs

Applying OCR Technology for Receipt Recognition

Hacking MNIST in 30 lines of Python

Optical Character Recognition Using One-Shot Learning, RNN, and TensorFlow

https://blog.altoros.com/optical-character-recognition-using-one-shot-learning-rnn-and-tensorflow.html

Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning

https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/

Projects

ocropy: Python-based tools for document analysis and OCR

Extracting text from an image using Ocropus

CLSTM: A small C++ implementation of LSTM networks, focused on OCR

OCR text recognition using tensorflow with attention

Digit Recognition via CNN: digital meter numbers detection

Attention-OCR: Visual Attention based OCR

umaru: An OCR system based on Torch that uses LSTM/GRU-RNN and CTC, drawing on the work of rnnlib and clstm

Tesseract.js: Pure Javascript OCR for 62 Languages

DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel)

deep ocr: make a better Chinese character recognition OCR than Tesseract

https://github.com/JinpengLI/deep_ocr

Practical Deep OCR for scene text using CTPN + CRNN

https://github.com/AKSHAYUBHAT/DeepVideoAnalytics/blob/master/notebooks/OCR/readme.md

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

https://github.com/weinman/cnn_lstm_ctc_ocr

SSD_scene-text-detection

Videos

LSTMs for OCR

Resources

Deep Learning for OCR

https://github.com/hs105/Deep-Learning-for-OCR

Scene Text Localization & Recognition Resources

awesome-ocr: A curated list of promising OCR resources

https://github.com/wanghaisheng/awesome-ocr