Object Detection

Published: 09 Oct 2015 Category: deep_learning
| Method | Backbone | Test size | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
|---|---|---|---|---|---|---|---|---|
| OverFeat | | | | | | 24.3% | | |
| R-CNN | AlexNet | | 58.5% | 53.7% | 53.3% | 31.4% | | |
| R-CNN | VGG16 | | 66.0% | | | | | |
| SPP_net | ZF-5 | | 54.2% | | | 31.84% | | |
| DeepID-Net | | | 64.1% | | | 50.3% | | |
| NoC | | | 73.3% | | 68.8% | | | |
| Fast-RCNN | VGG16 | | 70.0% | 68.8% | 68.4% | | 19.7%(@[0.5-0.95]), 35.9%(@0.5) | |
| MR-CNN | | | 78.2% | | 73.9% | | | |
| Faster-RCNN | VGG16 | | 78.8% | | 75.9% | | 21.9%(@[0.5-0.95]), 42.7%(@0.5) | 198ms |
| Faster-RCNN | ResNet101 | | 85.6% | | 83.8% | | 37.4%(@[0.5-0.95]), 59.0%(@0.5) | |
| YOLO | | | 63.4% | | 57.9% | | | 45 fps |
| YOLO | VGG-16 | | 66.4% | | | | | 21 fps |
| YOLOv2 | | 448x448 | 78.6% | | 73.4% | | 21.6%(@[0.5-0.95]), 44.0%(@0.5) | 40 fps |
| SSD | VGG16 | 300x300 | 77.2% | | 75.8% | | 25.1%(@[0.5-0.95]), 43.1%(@0.5) | 46 fps |
| SSD | VGG16 | 512x512 | 79.8% | | 78.5% | | 28.8%(@[0.5-0.95]), 48.5%(@0.5) | 19 fps |
| SSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 16 fps |
| SSD | ResNet101 | 512x512 | | | | | 31.2%(@[0.5-0.95]) | 8 fps |
| DSSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 8 fps |
| DSSD | ResNet101 | 500x500 | | | | | 33.2%(@[0.5-0.95]) | 6 fps |
| ION | | | 79.2% | | 76.4% | | | |
| CRAFT | | | 75.7% | | 71.3% | 48.5% | | |
| OHEM | | | 78.9% | | 76.3% | | 25.5%(@[0.5-0.95]), 45.9%(@0.5) | |
| R-FCN | ResNet50 | | 77.4% | | | | | 0.12sec(K40), 0.09sec(TitanX) |
| R-FCN | ResNet101 | | 79.5% | | | | | 0.17sec(K40), 0.12sec(TitanX) |
| R-FCN(ms train) | ResNet101 | | 83.6% | | 82.0% | | 31.5%(@[0.5-0.95]), 53.2%(@0.5) | |
| PVANet 9.0 | | | 84.9% | | 84.2% | | | 750ms(CPU), 46ms(TitanX) |
| RetinaNet | ResNet101-FPN | | | | | | | |
| Light-Head R-CNN | Xception* | 800/1200 | | | | | 31.5%(@[0.5-0.95]) | 95 fps |
| Light-Head R-CNN | Xception* | 700/1100 | | | | | 30.7%(@[0.5-0.95]) | 102 fps |
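
The MSCOCO column reports average precision at IoU thresholds, written as @0.5 for a single threshold and @[0.5-0.95] for the average over thresholds from 0.50 to 0.95 in steps of 0.05. As a minimal sketch of what those thresholds mean, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates (the function names here are illustrative, not from any listed paper or toolkit), the following computes IoU for one predicted/ground-truth pair and lists the thresholds at which the pair would count as a match:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "@[0.5-0.95]": 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which this prediction counts as a true positive."""
    overlap = iou(pred_box, gt_box)
    return [t for t in COCO_IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    pred = (0, 0, 100, 100)
    gt = (0, 0, 100, 62)
    print(iou(pred, gt))                 # overlap of the two boxes
    print(matched_thresholds(pred, gt))  # thresholds the pair satisfies
```

A detector with loosely localized boxes can score well at @0.5 while scoring much lower at @[0.5-0.95], which is why the table lists both numbers where available.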


Deep Neural Networks for Object Detection

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

Scalable Object Detection using Deep Neural Networks

Scalable, High-Quality Object Detection

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

Object Detectors Emerge in Deep Scene CNNs

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

Object Detection Networks on Convolutional Feature Maps

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

DeepBox: Learning Objectness with Convolutional Networks

Object detection via a multi-region & semantic segmentation-aware CNN model

AttentionNet: Aggregating Weak Directions for Accurate Object Detection


DenseBox: Unifying Landmark Localization with End to End Object Detection

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

Adaptive Object Detection Using Adjacency and Zoom Prediction

G-CNN: an Iterative Grid Based Object Detector

We don’t need no bounding-boxes: Training object class detectors using only human verification

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

A MultiPath Network for Object Detection

CRAFT Objects from Images


Training Region-based Object Detectors with Online Hard Example Mining

S-OHEM: Stratified Online Hard Example Mining for Object Detection


Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers


R-FCN: Object Detection via Region-based Fully Convolutional Networks

R-FCN-3000 at 30fps: Decoupling Detection and Classification


Recycle deep features for better object detection

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Multi-stage Object Detection with Group Recursive Learning

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

Gated Bi-directional CNN for Object Detection

Crafting GBD-Net for Object Detection

StuffNet: Using ‘Stuff’ to Improve Object Detection

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

Hierarchical Object Detection with Deep Reinforcement Learning

Learning to detect and localize many objects from few examples

Speed/accuracy trade-offs for modern convolutional object detectors

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

Feature Pyramid Network (FPN)

Feature Pyramid Networks for Object Detection

Dynamic Feature Pyramid Networks for Object Detection

Implicit Feature Pyramid Network for Object Detection

You Should Look at All Objects

Action-Driven Object Detection with Top-Down Visual Attentions

Beyond Skip Connections: Top-Down Modulation for Object Detection

Wide-Residual-Inception Networks for Real-time Object Detection

Attentional Network for Visual Object Detection

Learning Chained Deep Features and Classifiers for Cascade in Object Detection

DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries

Spatial Memory for Context Reasoning in Object Detection

Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection


LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

Point Linking Network for Object Detection

Perceptual Generative Adversarial Networks for Small Object Detection


Few-shot Object Detection


Yes-Net: An effective Detector Based on Global Information


Towards lightweight convolutional neural networks for object detection


RON: Reverse Connection with Objectness Prior Networks for Object Detection

Deformable Part-based Fully Convolutional Network for Object Detection

Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors

Recurrent Scale Approximation for Object Detection in CNN

DSOD: Learning Deeply Supervised Object Detectors from Scratch

Object Detection from Scratch with Deep Supervision


CoupleNet: Coupling Global Structure with Local Parts for Object Detection

Incremental Learning of Object Detectors without Catastrophic Forgetting

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection


StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection


Dynamic Zoom-in Network for Fast Object Detection in Large Images


Zero-Annotation Object Detection with Web Knowledge Transfer

MegDet: A Large Mini-Batch Object Detector

Receptive Field Block Net for Accurate and Fast Object Detection

An Analysis of Scale Invariance in Object Detection - SNIP

Feature Selective Networks for Object Detection


Learning a Rotation Invariant Detector with Rotatable Bounding Box

Scalable Object Detection for Stylized Objects

Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids

Deep Regionlets for Object Detection

Training and Testing Object Detectors with Virtual Images

Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video

  • keywords: object mining, object tracking, unsupervised object discovery by appearance-based clustering, self-supervised detector adaptation
  • arxiv: https://arxiv.org/abs/1712.08832

Spot the Difference by Object Detection

Localization-Aware Active Learning for Object Detection

Object Detection with Mask-based Feature Encoding


LSTD: A Low-Shot Transfer Detector for Object Detection

Pseudo Mask Augmented Object Detection


Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection

Learning Region Features for Object Detection

Object Detection for Comics using Manga109 Annotations

Task-Driven Super Resolution: Object Detection in Low-resolution Images


Transferring Common-Sense Knowledge for Object Detection


Multi-scale Location-aware Kernel Representation for Object Detection

Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors

DetNet: A Backbone network for Object Detection

AdvDetPatch: Attacking Object Detectors with Adversarial Patches


Attacking Object Detectors via Imperceptible Patches on Background


Physical Adversarial Examples for Object Detectors

Object detection at 200 Frames Per Second

Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images

SNIPER: Efficient Multi-Scale Training

Soft Sampling for Robust Object Detection


MetaAnchor: Learning to Detect Objects with Customized Anchors

Localization Recall Precision (LRP): A New Performance Metric for Object Detection

Pooling Pyramid Network for Object Detection

Modeling Visual Context is Key to Augmenting Object Detection Datasets

Acquisition of Localization Confidence for Accurate Object Detection

CornerNet: Detecting Objects as Paired Keypoints

Unsupervised Hard Example Mining from Videos for Improved Object Detection

SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection


A Survey of Modern Object Detection Literature using Deep Learning


Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages

Deep Feature Pyramid Reconfiguration for Object Detection

MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection

Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks


Deep Learning for Generic Object Detection: A Survey


Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples

Fast and accurate object detection in high resolution 4K and 8K video using GPUs

  • intro: Best Paper Finalist at IEEE High Performance Extreme Computing Conference (HPEC) 2018
  • intro: Carnegie Mellon University
  • arxiv: https://arxiv.org/abs/1810.10551

Hybrid Knowledge Routed Modules for Large-scale Object Detection

BAN: Focusing on Boundary Context for Object Detection


R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

DeRPN: Taking a further step toward more general object detection

Fast Efficient Object Detection Using Selective Attention


Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects


Efficient Coarse-to-Fine Non-Local Module for the Detection of Small Objects


Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection


Transferable Adversarial Attacks for Image and Video Object Detection


Anchor Box Optimization for Object Detection

AutoFocus: Efficient Multi-Scale Inference

Few-shot Object Detection via Feature Reweighting


Practical Adversarial Attack Against Object Detector


Scale-Aware Trident Networks for Object Detection

Region Proposal by Guided Anchoring

Bottom-up Object Detection by Grouping Extreme and Center Points

Bag of Freebies for Training Object Detection Neural Networks

Augmentation for small object detection


Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
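
For readers unfamiliar with the metric named in this title, here is a minimal sketch of generalized IoU for axis-aligned boxes, assuming `(x1, y1, x2, y2)` corner coordinates. It follows the commonly stated definition, GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest box enclosing both inputs; the function name is illustrative.

```python
def generalized_iou(box_a, box_b):
    """GIoU of two (x1, y1, x2, y2) boxes; ranges from -1 to 1."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap area, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if union <= 0 or enclose <= 0:
        return 0.0  # degenerate boxes: treated as zero in this sketch
    return inter / union - (enclose - union) / enclose


if __name__ == "__main__":
    print(generalized_iou((0, 0, 1, 1), (0, 0, 1, 1)))  # identical boxes
    print(generalized_iou((0, 0, 1, 1), (2, 0, 3, 1)))  # disjoint boxes
```

Unlike plain IoU, the value keeps changing as disjoint boxes move apart, so `1 - generalized_iou(...)` still gives a useful regression signal when the boxes do not overlap.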

SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition

BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors

DetNAS: Neural Architecture Search on Object Detection

ThunderNet: Towards Real-time Generic Object Detection


Feature Intertwiner for Object Detection

Improving Object Detection with Inverted Attention


What Object Should I Use? - Task Driven Object Detection

Towards Universal Object Detection by Domain Attention

Prime Sample Attention in Object Detection


BAOD: Budget-Aware Object Detection


An Analysis of Pre-Training on Object Detection

DuBox: No-Prior Box Objection Detection via Residual Dual Scale Detectors

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

Objects as Points

MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach

CenterNet: Object Detection with Keypoint Triplets

CenterNet: Keypoint Triplets for Object Detection

CornerNet-Lite: Efficient Keypoint Based Object Detection

CenterNet++ for Object Detection

Automated Focal Loss for Image based Object Detection


Exploring Object Relation in Mean Teacher for Cross-Domain Detection

An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection

RepPoints: Point Set Representation for Object Detection

Dense RepPoints: Representing Visual Objects with Dense Point Sets

RepPoints V2: Verification Meets Regression for Object Detection

Object Detection in 20 Years: A Survey


Light-Weight RetinaNet for Object Detection


Learning Data Augmentation Strategies for Object Detection

Towards Adversarially Robust Object Detection

Multi-adversarial Faster-RCNN for Unrestricted Object Detection

Object as Distribution

Detecting 11K Classes: Large Scale Object Detection without Fine-Grained Bounding Boxes

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing

Relation Distillation Networks for Video Object Detection

Imbalance Problems in Object Detection: A Review

FreeAnchor: Learning to Match Anchors for Visual Object Detection

Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection


Self-Training and Adversarial Background Regularization for Unsupervised Domain Adaptive One-Stage Object Detection

CBNet: A Novel Composite Backbone Network Architecture for Object Detection

CBNetV2: A Composite Backbone Network Architecture for Object Detection

A System-Level Solution for Low-Power Object Detection

Anchor Loss: Modulating Loss Scale based on Prediction Difficulty

Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression

Curriculum Self-Paced Learning for Cross-Domain Object Detection


Multiple Anchor Learning for Visual Object Detection


MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices

AugFPN: Improving Multi-scale Feature Learning for Object Detection

Object Detection as a Positive-Unlabeled Problem


Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN

BiDet: An Efficient Binarized Object Detector

Revisiting the Sibling Head in Object Detector

Extended Feature Pyramid Network for Small Object Detection


SaccadeNet: A Fast and Accurate Object Detector

Scale-Equalizing Pyramid Convolution for Object Detection

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Robust Object Detection under Occlusion with Context-Aware CompositionalNets

DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution

Learning a Unified Sample Weighting Network for Object Detection

2nd Place Solution for Waymo Open Dataset Challenge – 2D Object Detection

Domain Adaptive Object Detection via Asymmetric Tri-way Faster-RCNN

AQD: Towards Accurate Quantized Object Detection

Probabilistic Anchor Assignment with IoU Prediction for Object Detection

BorderDet: Border Feature for Dense Object Detection

Quantum-soft QUBO Suppression for Accurate Object Detection

VarifocalNet: An IoU-aware Dense Object Detector

The 1st Tiny Object Detection Challenge: Methods and Results

  • intro: ECCV2020 Workshop on Real-world Computer Vision from Inputs with Limited Quality (RLQ) and Tiny Object Detection Challenge
  • arxiv: https://arxiv.org/abs/2009.07506

MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection

SEA: Bridging the Gap Between One- and Two-stage Detector Distillation via SEmantic-aware Alignment

A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection

Effective Fusion Factor in FPN for Tiny Object Detection

Bi-Dimensional Feature Alignment for Cross-Domain Object Detection

Rethinking Transformer-based Set Prediction for Object Detection

Unsupervised Object Detection with LiDAR Clues

Self-EMD: Self-Supervised Object Detection without ImageNet

End-to-End Object Detection with Fully Convolutional Network

Fine-Grained Dynamic Head for Object Detection

Focal and Efficient IOU Loss for Accurate Bounding Box Regression

Scale Normalized Image Pyramids with AutoFocus for Object Detection

DetCo: Unsupervised Contrastive Learning for Object Detection

RMOPP: Robust Multi-Objective Post-Processing for Effective Object Detection


Instance Localization for Self-supervised Detection Pretraining

Localization Distillation for Object Detection

General Instance Distillation for Object Detection

Towards Open World Object Detection

Data Augmentation for Object Detection via Differentiable Neural Rendering

Revisiting the Loss Weight Adjustment in Object Detection

You Only Look One-level Feature

Optimization for Oriented Object Detection via Representation Invariance Loss

Dynamic Anchor Learning for Arbitrary-Oriented Object Detection

Control Distance IoU and Control Distance IoU Loss Function for Better Bounding Box Regression


OTA: Optimal Transport Assignment for Object Detection

Distilling Object Detectors via Decoupled Features

Distilling a Powerful Student Model via Online Knowledge Distillation

IQDet: Instance-wise Quality Distribution Sampling for Object Detection

You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection

Augmenting Anchors by the Detector Itself


Rethinking Training from Scratch for Object Detection

Dynamic Head: Unifying Object Detection Heads with Attentions

Disentangle Your Dense Object Detector

Improving Object Detection by Label Assignment Distillation

Progressive Hard-case Mining across Pyramid Levels in Object Detection

Multi-Scale Aligned Distillation for Low-Resolution Detection

Pix2seq: A Language Modeling Framework for Object Detection

Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity

Bootstrap Your Object Detector via Mixed Training

PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices

Toward Minimal Misalignment at Minimal Cost in One-Stage and Anchor-Free Object Detection


GiraffeDet: A Heavy-Neck Paradigm for Object Detection


A Dual Weighting Label Assignment Scheme for Object Detection

QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection

Two-Stage Object Detection


Rich feature hierarchies for accurate object detection and semantic segmentation

Fast R-CNN

Fast R-CNN

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

R-CNN minus R

Faster R-CNN in MXNet with distributed implementation and data parallelization

Contextual Priming and Feedback for Faster R-CNN

An Implementation of Faster RCNN with Study for Region Sampling

Interpretable R-CNN

Light-Head R-CNN: In Defense of Two-Stage Object Detector

Cascade R-CNN: Delving into High Quality Object Detection

Cascade R-CNN: High Quality Object Detection and Instance Segmentation

  • arxiv: https://arxiv.org/abs/1906.09756

Cascade RPN: Delving into High-Quality Region Proposal Network with Adaptive Convolution

SMC Faster R-CNN: Toward a scene-specialized multi-object detector


Domain Adaptive Faster R-CNN for Object Detection in the Wild

Robust Physical Adversarial Attack on Faster R-CNN Object Detector


Auto-Context R-CNN

Grid R-CNN

Grid R-CNN Plus: Faster and Better

Few-shot Adaptive Faster R-CNN

Libra R-CNN: Towards Balanced Learning for Object Detection

Rethinking Classification and Localization in R-CNN

Reprojection R-CNN: A Fast and Accurate Object Detector for 360° Images

Rethinking Classification and Localization for Cascade R-CNN

IoU-uniform R-CNN: Breaking Through the Limitations of RPN

Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training

Delving into the Imbalance of Positive Proposals in Two-stage Object Detection

Hierarchical Context Embedding for Region-based Object Detection

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Dynamic Sparse R-CNN

Featurized Query R-CNN

Augmenting Proposals by the Detector Itself

Probabilistic two-stage detection

Single-Shot Object Detection


You Only Look Once: Unified, Real-Time Object Detection

darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++

Start Training YOLO with Our Own Data

YOLO: Core ML versus MPSNNGraph

TensorFlow YOLO object detection on Android

Computer Vision in iOS – Object Detection


YOLO9000: Better, Faster, Stronger


Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2

LightNet: Bringing pjreddie’s DarkNet out of the shadows


YOLO v2 Bounding Box Tool


YOLOv3: An Incremental Improvement

Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving


YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers


Spiking-YOLO: Spiking Neural Network for Real-time Object Detection


YOLO Nano: a Highly Compact You Only Look Once Convolutional Neural Network for Object Detection


REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs


Poly-YOLO: higher speed, more precise detection and instance segmentation for YOLOv3


YOLOv4: Optimal Speed and Accuracy of Object Detection

YOLOX: Exceeding YOLO Series in 2021

PP-YOLO: An Effective and Efficient Implementation of Object Detector


YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Real-time Object Detection for Streaming Perception


SSD: Single Shot MultiBox Detector

What’s the diffience in performance between this new code you pushed and the previous code? #327


DSSD : Deconvolutional Single Shot Detector

Enhancement of SSD by concatenating feature maps for object detection

Context-aware Single-Shot Detector

Feature-Fused SSD: Fast Detection for Small Objects


FSSD: Feature Fusion Single Shot Multibox Detector


Weaving Multi-scale Context for Single Shot Detector

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection


MDSSD: Multi-scale Deconvolutional Single Shot Detector for small objects

Accurate Single Stage Detector Using Recurrent Rolling Convolution

Residual Features and Unified Prediction Network for Single Stage Detection



Focal Loss for Dense Object Detection

Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection

Focal Loss Dense Detector for Vehicle Surveillance
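
The focal loss referenced in these titles is usually written as FL(p_t) = -α_t (1 - p_t)^γ log(p_t). A minimal per-prediction sketch, assuming binary labels and the commonly cited defaults γ = 2 and α = 0.25 (the function name and the clamping constant are illustrative choices):

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class, in (0, 1).
    y: ground-truth label, 1 (object) or 0 (background).
    """
    # Clamp to avoid log(0); the epsilon is an arbitrary choice for this sketch.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1))  # easy positive: small loss
    print(focal_loss(0.1, 1))  # hard positive: much larger loss
```

With `gamma=0` the modulating factor disappears and the expression reduces to α-weighted cross-entropy, which makes the role of γ easy to check.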


Single-Shot Refinement Neural Network for Object Detection

Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection

Dual Refinement Network for Single-Shot Object Detection


ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch

Gradient Harmonized Single-stage Detector

M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network

Multi-layer Pruning Framework for Compressing Single Shot MultiBox Detector

Consistent Optimization for Single-Shot Object Detection

A Single-shot Object Detector with Feature Aggragation and Enhancement


Towards Accurate One-Stage Object Detection with AP-Loss

  • intro: CVPR 2019
  • intro: Shanghai Jiao Tong University & Intel Labs & Malaysia Multimedia University & Tencent YouTu Lab & Peking University
  • keywords: Average-Precision loss (AP-loss)
  • arxiv: https://arxiv.org/abs/1904.06373

AP-Loss for Accurate One-Stage Object Detection

Searching Parameterized AP Loss for Object Detection

Efficient Featurized Image Pyramid Network for Single Shot Detector

DR Loss: Improving Object Detection by Distributional Ranking

HAR-Net: Joint Learning of Hybrid Attention for Single-stage Object Detection


Propose-and-Attend Single Shot Detector


Revisiting Feature Alignment for One-stage Object Detection

IoU-balanced Loss Functions for Single-stage Object Detection

PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

  • intro: Chinese Academy of Sciences & University of Chinese Academy of Sciences
  • keywords: Anchor Promotion Module (APM), Feature Alignment Module (FAM)
  • arxiv: https://arxiv.org/abs/1908.03295

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object


Hierarchical Shot Detector

Learning from Noisy Anchors for One-stage Object Detection


Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

Single-Shot Two-Pronged Detector with Rectified IoU Loss

OneNet: Towards End-to-End One-Stage Object Detection

TOOD: Task-aligned One-stage Object Detection

Rethinking the Aligned and Misaligned Features in One-stage Object Detection



Feature Selective Anchor-Free Module for Single-Shot Object Detection

FCOS: Fully Convolutional One-Stage Object Detection

FoveaBox: Beyond Anchor-based Object Detector

IMMVP: An Efficient Daytime and Nighttime On-Road Object Detector


EfficientDet: Scalable and Efficient Object Detection

Domain Adaptation for Object Detection via Style Consistency

Soft Anchor-Point Object Detection

IPG-Net: Image Pyramid Guidance Network for Object Detection


Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

Localization Uncertainty Estimation for Anchor-Free Object Detection

Corner Proposal Network for Anchor-free, Two-stage Object Detection

Dive Deeper Into Box for Object Detection

Reducing Label Noise in Anchor-Free Object Detection

Balance-Oriented Focal Loss with Linear Scheduling for Anchor Free Object Detection


PAFNet: An Efficient Anchor-Free Object Detector Guidance

Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection

  • intro: CVPR 2021 Workshop
  • intro: UIUC & MIT-IBM Watson AI Lab & IBM T.J. Watson Research Center & NVIDIA & University of Oregon & Picsart AI Research (PAIR)
  • arxiv: https://arxiv.org/abs/2104.14082

ObjectBox: From Centers to Boxes for Anchor-Free Object Detection


End-to-End Object Detection with Transformers

Deformable DETR: Deformable Transformers for End-to-End Object Detection

RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Conditional DETR for Fast Training Convergence

End-to-End Object Detection with Adaptive Clustering Transformer

Toward Transformer-Based Object Detection

Efficient DETR: Improving End-to-End Object Detector with Dense Prior

Anchor DETR: Query Design for Transformer-Based Detector

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

Oriented Object Detection with Transformer

ViDT: An Efficient and Effective Fully Transformer-based Object Detector

An Extendable, Efficient and Effective Transformer-based Object Detector

Omni-DETR: Omni-Supervised Object Detection with Transformers

Accelerating DETR Convergence via Semantic-Aligned Matching

AdaMixer: A Fast-Converging Query-Based Object Detector

Exploring Plain Vision Transformer Backbones for Object Detection

Efficient Decoder-free Object Detection with Transformers

Non-Maximum Suppression (NMS)

End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression

A convnet for non-maximum suppression

Improving Object Detection With One Line of Code

Soft-NMS – Improving Object Detection With One Line of Code

Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection

Learning non-maximum suppression

Relation Networks for Object Detection

Learning Pairwise Relationship for Multi-object Detection in Crowded Scenes

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples


NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing

Hashing-based Non-Maximum Suppression for Crowded Object Detection

Visibility Guided NMS: Efficient Boosting of Amodal Object Detection in Crowded Traffic Scenes

  • intro: NeurIPS 2019, Machine Learning for Autonomous Driving Workshop
  • intro: Mercedes-Benz AG, R&D & University of Jena
  • keywords: Visibility Guided NMS (vg-NMS)
  • arxiv: https://arxiv.org/abs/2006.08547

Determinantal Point Process as an alternative to NMS


Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding


Object Detection Made Simpler by Eliminating Heuristic NMS
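
As background for the entries in this group, the following is a minimal sketch of the classic greedy NMS baseline that many of these papers modify or replace. It assumes `(x1, y1, x2, y2)` boxes and one score per box; the function names and the default threshold of 0.5 are illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    # Visit boxes from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))  # the second box overlaps the first heavily
```

Soft-NMS, listed above, keeps the same greedy ordering but lowers the scores of overlapping boxes instead of discarding them outright.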

Adversarial Examples

Adversarial Examples that Fool Detectors

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

Knowledge Distillation

Mimicking Very Efficient Network for Object Detection

Quantization Mimic: Towards Very Tiny CNN for Object Detection

Learning Efficient Detector with Semi-supervised Adaptive Distillation

Distilling Object Detectors with Fine-grained Feature Imitation

GAN-Knowledge Distillation for one-stage Object Detection


Learning Lightweight Pedestrian Detector with Hierarchical Knowledge Distillation

Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors

G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation

LGD: Label-guided Self-distillation for Object Detection

Deep Structured Instance Graph for Distilling Object Detectors

Instance-Conditional Knowledge Distillation for Object Detection

Distilling Object Detectors with Feature Richness

  • intro: University of Science and Technology of China & CAS & Cambricon Technologies & University of Chinese Academy of Sciences
  • arxiv: https://arxiv.org/abs/2111.00674

Focal and Global Knowledge Distillation for Detectors

Prediction-Guided Distillation for Dense Object Detection

Rotated Object Detection

Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss

Long-Tailed Object Detection

Factors in Finetuning Deep Model for object detection

Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection

A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection

  • intro: The Ohio State University & University of Central Florida & University of Southern California & Google Research
  • arxiv: https://arxiv.org/abs/2102.08884

Adaptive Class Suppression Loss for Long-Tail Object Detection

Weakly Supervised Object Detection

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

Weakly supervised object detection using pseudo-strong labels

Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection

Video Object Detection

Learning Object Class Detectors from Weakly Annotated Video

Analysing domain shift factors between videos and images for object detection

Video Object Recognition

Deep Learning for Saliency Prediction in Natural Video

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

Object Detection from Video Tubelets with Convolutional Neural Networks

Object Detection in Videos with Tubelets and Multi-context Cues

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

CNN Based Object Detection in Large Video Images

Object Detection in Videos with Tubelet Proposal Networks

Flow-Guided Feature Aggregation for Video Object Detection

Video Object Detection using Faster R-CNN

Improving Context Modeling for Video Object Detection and Tracking


Temporal Dynamic Graph LSTM for Action-driven Video Object Detection

Mobile Video Object Detection with Temporally-Aware Feature Maps


Towards High Performance Video Object Detection


Impression Network for Video Object Detection


Spatial-Temporal Memory Networks for Video Object Detection


3D-DETNet: a Single Stage Video-Based Vehicle Detector


Object Detection in Videos by Short and Long Range Object Linking


Object Detection in Video with Spatiotemporal Sampling Networks

Towards High Performance Video Object Detection for Mobiles

Optimizing Video Object Detection via a Scale-Time Lattice

Pack and Detect: Fast Object Detection in Videos Using Region-of-Interest Packing


Fast Object Detection in Compressed Video


Tube-CNN: Modeling temporal evolution of appearance for object detection in video

AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling

SCNN: A General Distribution based Statistical Convolutional Neural Network with Application to Video Object Detection

Looking Fast and Slow: Memory-Guided Mobile Video Object Detection

Progressive Sparse Local Attention for Video Object Detection

Sequence Level Semantics Aggregation for Video Object Detection

Object Detection in Video with Spatial-temporal Context Aggregation

A Delay Metric for Video Object Detection: What Average Precision Fails to Tell

Minimum Delay Object Detection From Video

Learning Motion Priors for Efficient Video Object Detection


Object-aware Feature Aggregation for Video Object Detection

End-to-End Video Object Detection with Spatial-Temporal Transformers

Object Detection on Mobile Devices

Pelee: A Real-Time Object Detection System on Mobile Devices

Object Detection on RGB-D

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Differential Geometry Boosts Convolutional Neural Networks for Object Detection

A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation


Cross-Modal Attentional Context Learning for RGB-D Object Detection

Zero-Shot Object Detection

Zero-Shot Detection

Zero-Shot Object Detection


Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts

Zero-Shot Object Detection by Hybrid Region Embedding

Visual Relationship Detection

Visual Relationship Detection with Language Priors

ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection

Visual Translation Embedding Network for Visual Relation Detection

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

Detecting Visual Relationships with Deep Relational Networks

Identifying Spatial Relations in Images using Convolutional Neural Networks


PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN

Natural Language Guided Visual Relationship Detection


Detecting Visual Relationships Using Box Attention

Google AI Open Images - Visual Relationship Track

Context-Dependent Diffusion Network for Visual Relationship Detection

A Problem Reduction Approach for Visual Relationships Detection

Exploring the Semantics for Visual Relationship Detection


Face Detection

Multi-view Face Detection Using Deep Convolutional Neural Networks

From Facial Parts Responses to Face Detection: A Deep Learning Approach

Compact Convolutional Neural Network Cascade for Face Detection

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection

Towards a Deep Learning Framework for Unconstrained Face Detection

Supervised Transformer Network for Efficient Face Detection

UnitBox: An Advanced Object Detection Network

Bootstrapping Face Detection with Hard Negative Examples

Grid Loss: Detecting Occluded Faces

A Multi-Scale Cascade Fully Convolutional Network Face Detector


Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

Face Detection using Deep Learning: An Improved Faster RCNN Approach

Faceness-Net: Face Detection through Deep Facial Part Responses

Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”

End-To-End Face Detection and Recognition


Face R-CNN


Face Detection through Scale-Friendly Deep Convolutional Networks


Scale-Aware Face Detection

Detecting Faces Using Inside Cascaded Contextual CNN

Multi-Branch Fully Convolutional Network for Face Detection


SSH: Single Stage Headless Face Detector

Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container


FaceBoxes: A CPU Real-time Face Detector with High Accuracy

S3FD: Single Shot Scale-invariant Face Detector

Detecting Faces Using Region-based Fully Convolutional Networks


AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection


Face Attention Network: An effective Face Detector for the Occluded Faces


Feature Agglomeration Networks for Single Stage Face Detection


Face Detection Using Improved Faster RCNN

PyramidBox: A Context-assisted Single Shot Face Detector

PyramidBox++: High Performance Detector for Finding Tiny Face

A Fast Face Detection Method via Convolutional Neural Network

Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy

Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks

SFace: An Efficient Network for Face Detection in Large Scale Variations

Survey of Face Detection on Low-quality Images


Anchor Cascade for Efficient Face Detection

Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization

Selective Refinement Network for High Performance Face Detection


DSFD: Dual Shot Face Detector


Learning Better Features for Face Detection with Feature Fusion and Segmentation Supervision


FA-RPN: Floating Region Proposals for Face Detection


Robust and High Performance Face Detector


DAFE-FD: Density Aware Feature Enrichment for Face Detection


Improved Selective Refinement Network for Face Detection

Revisiting a single-stage method for face detection


MSFD: Multi-Scale Receptive Field Face Detector

LFFD: A Light and Fast Face Detector for Edge Devices

RetinaFace: Single-stage Dense Face Localisation in the Wild

BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs

HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces

KPNet: Towards Minimal Face Detector

ASFD: Automatic and Scalable Face Detector

TinaFace: Strong but Simple Baseline for Face Detection

MogFace: Rethinking Scale Augmentation on the Face Detector

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection

1st Place Solutions for UG2+ Challenge 2021 – (Semi-)supervised Face detection in the low light condition

MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation

Detect Small Faces

Finding Tiny Faces

Detecting and counting tiny faces

Seeing Small Faces from Robust Anchor’s Perspective

Face-MagNet: Magnifying Feature Maps to Detect Small Faces

Robust Face Detection via Learning Small Faces on Hard Images

SFA: Small Faces Attention Face Detector

Person Head Detection

Context-aware CNNs for person head detection

Detecting Heads using Feature Refine Net and Cascaded Multi-scale Architecture


A Comparison of CNN-based Face and Head Detectors for Real-Time Video Surveillance Applications


FCHD: A fast and accurate head detector

Relational Learning for Joint Head and Human Detection

Body-Face Joint Detection via Embedding and Head Hook

Pedestrian Detection / People Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

Deep Learning Strong Parts for Pedestrian Detection

Taking a Deeper Look at Pedestrians

Convolutional Channel Features

End-to-end people detection in crowded scenes

Learning Complexity-Aware Cascades for Deep Pedestrian Detection

Deep convolutional neural networks for pedestrian detection

Scale-aware Fast R-CNN for Pedestrian Detection

New algorithm improves speed and accuracy of pedestrian detection

Pushing the Limits of Deep CNNs for Pedestrian Detection

  • intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
  • arxiv: http://arxiv.org/abs/1603.04525

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

Is Faster R-CNN Doing Well for Pedestrian Detection?

Unsupervised Deep Domain Adaptation for Pedestrian Detection

Reduced Memory Region Based Deep Convolutional Neural Network Detection

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

Detecting People in Artwork with CNNs

Deep Multi-camera People Detection

Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters

What Can Help Pedestrian Detection?

Illuminating Pedestrians via Simultaneous Detection & Segmentation


Rotational Rectification Network for Robust Pedestrian Detection

STD-PD: Generating Synthetic Training Data for Pedestrian Detection in Unannotated Videos

Too Far to See? Not Really! — Pedestrian Detection with Scale-aware Localization Policy


Aggregated Channels Network for Real-Time Pedestrian Detection


Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection


Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond


PCN: Part and Context Information for Pedestrian Detection with CNNs

Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors

Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation

Bi-box Regression for Pedestrian Detection and Occlusion Estimation

Pedestrian Detection with Autoregressive Network Phases

SSA-CNN: Semantic Self-Attention CNN for Pedestrian Detection


High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection

Center and Scale Prediction: A Box-free Approach for Object Detection

Evading Real-Time Person Detectors by Adversarial T-shirt


Coupled Network for Robust Pedestrian Detection with Gated Multi-Layer Feature Extraction and Deformable Occlusion Handling


Scale Match for Tiny Person Detection

SM+: Refined Scale Match for Tiny Person Detection


Resisting the Distracting-factors in Pedestrian Detection

SADet: Learning An Efficient and Accurate Pedestrian Detector


NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination

Anchor-free Small-scale Multispectral Pedestrian Detection

LLA: Loss-aware Label Assignment for Dense Pedestrian Detection

DETR for Pedestrian Detection


V2F-Net: Explicit Decomposition of Occluded Pedestrian Detection

Pedestrian Detection in a Crowd

Repulsion Loss: Detecting Pedestrians in a Crowd

Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd

Adaptive NMS: Refining Pedestrian Detection in a Crowd

PedHunter: Occlusion Robust Pedestrian Detector in Crowded Scenes

Double Anchor R-CNN for Human Detection in a Crowd

CSID: Center, Scale, Identity and Density-aware Pedestrian Detection in a Crowd


Semantic Head Enhanced Pedestrian Detection in a Crowd


Detection in Crowded Scenes: One Proposal, Multiple Predictions

Visible Feature Guidance for Crowd Pedestrian Detection

Occluded Pedestrian Detection

Mask-Guided Attention Network for Occluded Pedestrian Detection

Multispectral Pedestrian Detection

Multispectral Deep Neural Networks for Pedestrian Detection

Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection

Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation

The Cross-Modality Disparity Problem in Multispectral Pedestrian Detection


Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection


GFD-SSD: Gated Fusion Double SSD for Multispectral Pedestrian Detection


Unsupervised Domain Adaptation for Multispectral Pedestrian Detection


Vehicle Detection

DAVE: A Unified Framework for Fast Vehicle Detection and Annotation

Evolving Boxes for fast Vehicle Detection

Fine-Grained Car Detection for Visual Census Estimation

SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection

Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data

Domain Randomization for Scene-Specific Car Detection and Pose Estimation


ShuffleDet: Real-Time Vehicle Detection Network in On-board Embedded UAV Imagery

Traffic-Sign Detection

Traffic-Sign Detection and Classification in the Wild

Evaluating State-of-the-art Object Detector on Challenging Traffic Light Data

Detecting Small Signs from Large Images

Localized Traffic Sign Detection with Multi-scale Deconvolution Networks


Detecting Traffic Lights by Single Shot Detection

A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection

Skeleton Detection

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

SRN: Side-output Residual Network for Object Symmetry Detection in the Wild

Hi-Fi: Hierarchical Feature Integration for Skeleton Detection


Fruit Detection

Deep Fruit Detection in Orchards

Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards

Shadow Detection

Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network


A+D-Net: Shadow Detection with Adversarial Shadow Attenuation


Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal


Direction-aware Spatial Context Features for Shadow Detection

Direction-aware Spatial Context Features for Shadow Detection and Removal

Other Detection

Deep Deformation Network for Object Landmark Localization

Fashion Landmark Detection in the Wild

Deep Learning for Fast and Accurate Fashion Item Detection

OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

Deep Cuboid Detection: Beyond 2D Bounding Boxes

Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

Deep Learning Logo Detection with Data Expansion by Synthesising Context

Scalable Deep Learning Logo Detection


Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks

Automatic Handgun Detection Alarm in Videos Using Deep Learning

Objects as context for part detection


Using Deep Networks for Drone Detection

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

Target Driven Instance Detection


DeepVoting: An Explainable Framework for Semantic Part Detection under Partial Occlusion


VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition

Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants


ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos

Deep Learning Object Detection Methods for Ecological Camera Trap Data

EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection


Towards End-to-End Lane Detection: an Instance Segmentation Approach

Densely Supervised Grasp Detector (DSGD)


Object Proposal

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

Scale-aware Pixel-wise Object Proposal Networks

Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization

Learning to Segment Object Proposals via Recursive Neural Networks

Learning Detection with Diverse Proposals

  • intro: CVPR 2017
  • keywords: differentiable Determinantal Point Process (DPP) layer, Learning Detection with Diverse Proposals (LDDP)
  • arxiv: https://arxiv.org/abs/1704.03533

ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond

Improving Small Object Proposals for Company Logo Detection

Open Logo Detection Challenge

AttentionMask: Attentive, Efficient Object Proposal Generation Focusing on Small Objects


Beyond Bounding Boxes: Precise Localization of Objects in Images

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

Weakly Supervised Object Localization Using Size Estimates

Active Object Localization with Deep Reinforcement Learning

Localizing objects using referring expressions

LocNet: Improving Localization Accuracy for Object Detection

Learning Deep Features for Discriminative Localization

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

Ensemble of Part Detectors for Simultaneous Classification and Localization


STNet: Selective Tuning of Convolutional Networks for Object Localization


Soft Proposal Networks for Weakly Supervised Object Localization

Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN

Tutorials / Talks

Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection

Towards Good Practices for Recognition & Detection

Work in progress: Improving object detection and instance segmentation for small objects


Object Detection with Deep Learning: A Review

SimpleDet - A Simple and Versatile Framework for Object Detection and Instance Recognition


TensorBox: a simple framework for training neural networks to detect objects in images


Object detection in torch: Implementation of some object detection frameworks in torch

Using DIGITS to train an Object Detection network

FCN-MultiBox Detector

KittiBox: A car detection model implemented in TensorFlow

Deformable Convolutional Networks + MST + Soft-NMS

How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow

Metrics for object detection

Detection Results: VOC2012


BeaverDam: Video annotation tool for deep learning training labels

Convolutional Neural Networks for Object Detection


Introducing automatic object detection to visual search (Pinterest)

Deep Learning for Object Detection with DIGITS

Analyzing The Papers Behind Facebook’s Computer Vision Approach

Easily Create High Quality Object Detectors with Deep Learning

How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit

Object Detection in Satellite Imagery, a Low Overhead Approach

You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks

Faster R-CNN Pedestrian and Car Detection

Small U-Net for vehicle detection

Region of interest pooling explained

Supercharge your Computer Vision models with the TensorFlow Object Detection API

Understanding SSD MultiBox — Real-Time Object Detection In Deep Learning


One-shot object detection


An overview of object detection: one-stage methods


deep learning object detection