Segmentation
Papers
Deep Joint Task Learning for Generic Object Extraction
- intro: NIPS 2014
- homepage: http://vision.sysu.edu.cn/projects/deep-joint-task-learning/
- paper: http://ss.sysu.edu.cn/~ll/files/NIPS2014_JointTask.pdf
- github: https://github.com/xiaolonw/nips14_loc_seg_testonly
- dataset: http://objectextraction.github.io/
Highly Efficient Forward and Backward Propagation of Convolutional Neural Networks for Pixelwise Classification
- arxiv: https://arxiv.org/abs/1412.4526
- code(Caffe): https://dl.dropboxusercontent.com/u/6448899/caffe.zip
- author page: http://www.ee.cuhk.edu.hk/~hsli/
Segmentation from Natural Language Expressions
- intro: ECCV 2016
- project page: http://ronghanghu.com/text_objseg/
- arxiv: http://arxiv.org/abs/1603.06180
- github(TensorFlow): https://github.com/ronghanghu/text_objseg
- github(Caffe): https://github.com/Seth-Park/text_objseg_caffe
Semantic Object Parsing with Graph LSTM
Fine Hand Segmentation using Convolutional Neural Networks
Feedback Neural Network for Weakly Supervised Geo-Semantic Segmentation
- intro: Facebook Connectivity Lab & Facebook Core Data Science & University of Illinois
- arxiv: https://arxiv.org/abs/1612.02766
FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics
A deep learning model integrating FCNNs and CRFs for brain tumor segmentation
Texture segmentation with Fully Convolutional Networks
- intro: Dublin City University
- arxiv: https://arxiv.org/abs/1703.05230
Fast LIDAR-based Road Detection Using Convolutional Neural Networks
- arxiv: https://arxiv.org/abs/1703.03613
Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
Annotating Object Instances with a Polygon-RNN
- intro: CVPR 2017. CVPR Best Paper Honorable Mention Award
- intro: University of Toronto
- keywords: PolygonRNN
- project page: http://www.cs.toronto.edu/polyrnn/
- arxiv: https://arxiv.org/abs/1704.05548
Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
- intro: CVPR 2018
- keywords: PolygonRNN++
- project page: http://www.cs.toronto.edu/polyrnn/
- arxiv: https://arxiv.org/abs/1803.09693
- github: https://github.com/davidjesusacu/polyrnn-pp
Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF
- intro: CVPR 2017
- paper: http://openaccess.thecvf.com/content_cvpr_2017/papers/Shen_Semantic_Segmentation_via_CVPR_2017_paper.pdf
- github(Caffe): https://github.com/FalongShen/SegModel
Distantly Supervised Road Segmentation
- intro: ICCV workshop CVRSUAD2017. Indiana University & Preferred Networks
- arxiv: https://arxiv.org/abs/1708.06118
Ω-Net: Fully Automatic, Multi-View Cardiac MR Detection, Orientation, and Segmentation with Deep Neural Networks
Ω-Net (Omega-Net): Fully Automatic, Multi-View Cardiac MR Detection, Orientation, and Segmentation with Deep Neural Networks
- arxiv: https://arxiv.org/abs/1711.01094
Superpixel clustering with deep features for unsupervised road segmentation
- intro: Preferred Networks, Inc & Indiana University
- arxiv: https://arxiv.org/abs/1711.05998
Learning to Segment Human by Watching YouTube
- intro: TPAMI 2017
- arxiv: https://arxiv.org/abs/1710.01457
W-Net: A Deep Model for Fully Unsupervised Image Segmentation
- arxiv: https://arxiv.org/abs/1711.08506
End-to-end detection-segmentation network with ROI convolution
- intro: ISBI 2018
- arxiv: https://arxiv.org/abs/1801.02722
A Foreground Inference Network for Video Surveillance Using Multi-View Receptive Field
- arxiv: https://arxiv.org/abs/1801.06593
Piecewise Flat Embedding for Image Segmentation
- arxiv: https://arxiv.org/abs/1802.03248
A Pyramid CNN for Dense-Leaves Segmentation
- intro: Computer and Robot Vision, Toronto, May 2018
- arxiv: https://arxiv.org/abs/1804.01646
Capsules for Object Segmentation
- keywords: convolutional-deconvolutional capsule network, SegCaps, U-Net
- arxiv: https://arxiv.org/abs/1804.04241
Deep Object Co-Segmentation
- arxiv: https://arxiv.org/abs/1804.06423
Semantic Aware Attention Based Deep Object Co-segmentation
- arxiv: https://arxiv.org/abs/1810.06859
Contextual Hourglass Networks for Segmentation and Density Estimation
- arxiv: https://arxiv.org/abs/1806.04009
U-Net
U-Net: Convolutional Networks for Biomedical Image Segmentation
- intro: conditionally accepted at MICCAI 2015
- project page: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
- arxiv: http://arxiv.org/abs/1505.04597
- code+data: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/u-net-release-2015-10-02.tar.gz
- github: https://github.com/orobix/retina-unet
- github: https://github.com/jakeret/tf_unet
- notes: http://zongwei.leanote.com/post/Pa
UNet++: A Nested U-Net Architecture for Medical Image Segmentation
- intro: 4th Deep Learning in Medical Image Analysis (DLMIA) Workshop
- arxiv: https://arxiv.org/abs/1807.10165
UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation
- intro: ICASSP 2020
- arxiv: https://arxiv.org/abs/2004.08790
- github: https://github.com/ZJUGiveLab/UNet-Version
DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation
- arxiv: https://arxiv.org/abs/1709.00201
TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation
- intro: Lyft Inc. & MIT
- intro: part of the winning solution (1st out of 735) in the Kaggle: Carvana Image Masking Challenge
- arxiv: https://arxiv.org/abs/1801.05746
- github: https://github.com/ternaus/TernausNet
A Probabilistic U-Net for Segmentation of Ambiguous Images
- intro: DeepMind & German Cancer Research Center
- arxiv: https://arxiv.org/abs/1806.05034
Deep Dual Pyramid Network for Barcode Segmentation using Barcode-30k Database
- arxiv: https://arxiv.org/abs/1807.11886
Deep Smoke Segmentation
- arxiv: https://arxiv.org/abs/1809.00774
Smoothed Dilated Convolutions for Improved Dense Prediction
- intro: KDD 2018
- arxiv: https://arxiv.org/abs/1808.08931
- github: https://github.com/divelab/dilated
DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation
- arxiv: https://arxiv.org/abs/1809.06013
Improving Fast Segmentation With Teacher-student Learning
- arxiv: https://arxiv.org/abs/1810.08476
DSNet: An Efficient CNN for Road Scene Segmentation
- arxiv: https://arxiv.org/abs/1904.05022
Line Segment Detection Using Transformers without Edges
- arxiv: https://arxiv.org/abs/2101.01909
Holistic Segmentation
- intro: Technical University of Munich & BMW Group & Johns Hopkins University & Google
- arxiv: https://arxiv.org/abs/2209.05407
Unified Image Segmentation
K-Net: Towards Unified Image Segmentation
- intro: NeurIPS 2021
- intro: Nanyang Technological University & Chinese University of Hong Kong & SenseTime Research & Shanghai AI Laboratory
- project page: https://www.mmlab-ntu.com/project/knet/index.html
- arxiv: https://arxiv.org/abs/2106.14855
- github: https://github.com/ZwwWayne/K-Net/
Masked-attention Mask Transformer for Universal Image Segmentation
- project page: https://bowenc0221.github.io/mask2former/
- arxiv: https://arxiv.org/abs/2112.01527
- github: https://github.com/facebookresearch/Mask2Former
Mask2Former for Video Instance Segmentation
- intro: University of Illinois at Urbana-Champaign (UIUC) & Facebook AI Research (FAIR)
- arxiv: https://arxiv.org/abs/2112.10764
- github: https://github.com/facebookresearch/Mask2Former
Foreground Object Segmentation
Pixel Objectness
- project page: http://vision.cs.utexas.edu/projects/pixelobjectness/
- arxiv: https://arxiv.org/abs/1701.05349
- github: https://github.com/suyogduttjain/pixelobjectness
A Deep Convolutional Neural Network for Background Subtraction
Learning Multi-scale Features for Foreground Segmentation
Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview
- arxiv: https://arxiv.org/abs/1810.04377
Semantic Segmentation
Fully Convolutional Networks for Semantic Segmentation
- intro: CVPR 2015, PAMI 2016
- keywords: deconvolutional layer, crop layer
- arxiv: http://arxiv.org/abs/1411.4038
- arxiv(PAMI 2016): http://arxiv.org/abs/1605.06211
- slides: https://docs.google.com/presentation/d/1VeWFMpZ8XN7OC3URZP4WdXvOGYckoFWGVN7hApoXVnc
- slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-pixels.pdf
- talk: http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/
- github(official): https://github.com/shelhamer/fcn.berkeleyvision.org
- github: https://github.com/BVLC/caffe/wiki/Model-Zoo#fcn
- github: https://github.com/MarvinTeichmann/tensorflow-fcn
- github(Chainer): https://github.com/wkentaro/fcn
- github: https://github.com/wkentaro/pytorch-fcn
- github: https://github.com/shekkizh/FCN.tensorflow
- notes: http://zhangliliang.com/2014/11/28/paper-note-fcn-segment/
From Image-level to Pixel-level Labeling with Convolutional Networks
- intro: CVPR 2015
- intro: “Weakly Supervised Semantic Segmentation with Convolutional Networks”
- intro: performs semantic segmentation based only on image-level annotations in a multiple instance learning framework
- arxiv: http://arxiv.org/abs/1411.6228
- paper: http://ronan.collobert.com/pub/matos/2015_semisupsemseg_cvpr.pdf
Feedforward semantic segmentation with zoom-out features
- intro: CVPR 2015. Toyota Technological Institute at Chicago
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Mostajabi_Feedforward_Semantic_Segmentation_2015_CVPR_paper.pdf
- bitbucket: https://bitbucket.org/m_mostajabi/zoom-out-release
- video: https://www.youtube.com/watch?v=HvgvX1LXQa8
DeepLab
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
- intro: ICLR 2015. DeepLab
- arxiv: http://arxiv.org/abs/1412.7062
- bitbucket: https://bitbucket.org/deeplab/deeplab-public/
- github: https://github.com/TheLegendAli/DeepLab-Context
Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation
- intro: DeepLab
- arxiv: http://arxiv.org/abs/1502.02734
- bitbucket: https://bitbucket.org/deeplab/deeplab-public/
- github: https://github.com/TheLegendAli/DeepLab-Context
DeepLab v2
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
- intro: TPAMI
- intro: 79.7% mIOU in the test set, PASCAL VOC-2012 semantic image segmentation task
- intro: Updated version of our previous ICLR 2015 paper
- project page: http://liangchiehchen.com/projects/DeepLab.html
- arxiv: https://arxiv.org/abs/1606.00915
- bitbucket: https://bitbucket.org/aquariusjay/deeplab-public-ver2
- github: https://github.com/DrSleep/tensorflow-deeplab-resnet
- github: https://github.com/isht7/pytorch-deeplab-resnet
DeepLabv2 (ResNet-101)
- project page: http://liangchiehchen.com/projects/DeepLabv2_resnet.html
DeepLab v3
Rethinking Atrous Convolution for Semantic Image Segmentation
- intro: Google. DeepLabv3
- arxiv: https://arxiv.org/abs/1706.05587
DeepLabv3+
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
- intro: Google Inc.
- arxiv: https://arxiv.org/abs/1802.02611
- github: https://github.com/tensorflow/models/tree/master/research/deeplab
- blog: https://research.googleblog.com/2018/03/semantic-image-segmentation-with.html
- github: https://github.com/hualin95/Deeplab-v3plus
DeeperLab
DeeperLab: Single-Shot Image Parser
- intro: MIT & Google Inc. & UC Berkeley
- arxiv: https://arxiv.org/abs/1902.05093
Auto-DeepLab
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
- intro: CVPR 2019 oral
- intro: Johns Hopkins University & Google & Stanford University
- arxiv: https://arxiv.org/abs/1901.02985
- github: https://github.com/tensorflow/models/tree/master/research/deeplab
Conditional Random Fields as Recurrent Neural Networks
- intro: ICCV 2015
- intro: Oxford / Stanford / Baidu
- keywords: CRF-RNN
- project page: http://www.robots.ox.ac.uk/~szheng/CRFasRNN.html
- arxiv: http://arxiv.org/abs/1502.03240
- github: https://github.com/torrvision/crfasrnn
- demo: http://www.robots.ox.ac.uk/~szheng/crfasrnndemo
- github: https://github.com/martinkersner/train-CRF-RNN
BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation
Efficient piecewise training of deep structured models for semantic segmentation
- intro: CVPR 2016
- arxiv: http://arxiv.org/abs/1504.01013
Learning Deconvolution Network for Semantic Segmentation
- intro: ICCV 2015
- intro: two-stage training: train the network with easy examples first and fine-tune the trained network with more challenging examples later
- keywords: DeconvNet
- project page: http://cvlab.postech.ac.kr/research/deconvnet/
- arxiv: http://arxiv.org/abs/1505.04366
- slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w06-deconvnet.pdf
- gitxiv: http://gitxiv.com/posts/9tpJKNTYksN5eWcHz/learning-deconvolution-network-for-semantic-segmentation
- github: https://github.com/HyeonwooNoh/DeconvNet
- github: https://github.com/HyeonwooNoh/caffe
SegNet
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling
- arxiv: http://arxiv.org/abs/1505.07293
- github: https://github.com/alexgkendall/caffe-segnet
- github: https://github.com/pfnet-research/chainer-segnet
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- homepage: http://mi.eng.cam.ac.uk/projects/segnet/
- arxiv: http://arxiv.org/abs/1511.00561
- github: https://github.com/alexgkendall/caffe-segnet
- tutorial: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html
SegNet: Pixel-Wise Semantic Labelling Using a Deep Networks
Getting Started with SegNet
- blog: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html
- github: https://github.com/alexgkendall/SegNet-Tutorial
ParseNet: Looking Wider to See Better
- intro: ICLR 2016
- arxiv: http://arxiv.org/abs/1506.04579
- github: https://github.com/weiliu89/caffe/tree/fcn
- caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#parsenet-looking-wider-to-see-better
Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation
- intro: ICLR 2016
- keywords: DecoupledNet
- project(paper+code): http://cvlab.postech.ac.kr/research/decouplednet/
- arxiv: http://arxiv.org/abs/1506.04924
- github: https://github.com/HyeonwooNoh/DecoupledNet
Semantic Image Segmentation via Deep Parsing Network
- intro: ICCV 2015. CUHK
- keywords: Deep Parsing Network (DPN), Markov Random Field (MRF)
- homepage: http://personal.ie.cuhk.edu.hk/~lz013/projects/DPN.html
- arxiv: http://arxiv.org/abs/1509.02634
- paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Liu_Semantic_Image_Segmentation_ICCV_2015_paper.pdf
- slides: http://personal.ie.cuhk.edu.hk/~pluo/pdf/presentation_dpn.pdf
Multi-Scale Context Aggregation by Dilated Convolutions
- intro: ICLR 2016.
- intro: Dilated Convolution for Semantic Image Segmentation
- homepage: http://vladlen.info/publications/multi-scale-context-aggregation-by-dilated-convolutions/
- arxiv: http://arxiv.org/abs/1511.07122
- github: https://github.com/fyu/dilation
- github: https://github.com/nicolov/segmentation_keras
- notes: http://www.inference.vc/dilated-convolutions-and-kronecker-factorisation/
Instance-aware Semantic Segmentation via Multi-task Network Cascades
- intro: CVPR 2016 oral. 1st-place winner of MS COCO 2015 segmentation competition
- keywords: RoI warping layer, Multi-task Network Cascades (MNC)
- arxiv: http://arxiv.org/abs/1512.04412
- github: https://github.com/daijifeng001/MNC
Object Segmentation on SpaceNet via Multi-task Network Cascades (MNC)
- blog: https://medium.com/the-downlinq/object-segmentation-on-spacenet-via-multi-task-network-cascades-mnc-f1c89d790b42
- github: https://github.com/lncohn/pascal_to_spacenet
Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network
- intro: TransferNet
- project page: http://cvlab.postech.ac.kr/research/transfernet/
- arxiv: http://arxiv.org/abs/1512.07928
- github: https://github.com/maga33/TransferNet
Combining the Best of Convolutional Layers and Recurrent Layers: A Hybrid Network for Semantic Segmentation
Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation
- intro: ECCV 2016
- arxiv: https://arxiv.org/abs/1603.06098
- github: https://github.com/kolesman/SEC
ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation
- project page: http://research.microsoft.com/en-us/um/people/jifdai/downloads/scribble_sup/
- arxiv: http://arxiv.org/abs/1604.05144
Laplacian Reconstruction and Refinement for Semantic Segmentation
Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation
- intro: ECCV 2016
- arxiv: https://arxiv.org/abs/1605.02264
- paper: https://www.ics.uci.edu/~fowlkes/papers/gf-eccv16.pdf
- github(MatConvNet): https://github.com/golnazghiasi/LRR
Natural Scene Image Segmentation Based on Multi-Layer Feature Extraction
Convolutional Random Walk Networks for Semantic Image Segmentation
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
- arxiv: http://arxiv.org/abs/1606.02147
- github: https://github.com/e-lab/ENet-training
- github(Caffe): https://github.com/TimoSaemann/ENet
- github: https://github.com/PavlosMelissinos/enet-keras
- github: https://github.com/kwotsin/TensorFlow-ENet
- blog: http://culurciello.github.io/tech/2016/06/20/training-enet.html
Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery
Deep Learning Markov Random Field for Semantic Segmentation
Region-based semantic segmentation with end-to-end training
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1607.07671
- github: https://github.com/nightrome/matconvnet-calvin
Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1609.00446
PixelNet: Towards a General Pixel-level Architecture
- intro: semantic segmentation, edge detection
- arxiv: http://arxiv.org/abs/1609.06694
Exploiting Depth from Single Monocular Images for Object Detection and Semantic Segmentation
- intro: IEEE T. Image Processing
- intro: propose an RGB-D semantic segmentation method which applies a multi-task training scheme: semantic label prediction and depth value regression
- arxiv: https://arxiv.org/abs/1610.01706
PixelNet: Representation of the pixels, by the pixels, and for the pixels
- intro: CMU & Adobe Research
- project page: http://www.cs.cmu.edu/~aayushb/pixelNet/
- arxiv: https://arxiv.org/abs/1702.06506
- github(Caffe): https://github.com/aayushbansal/PixelNet
Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks
Deep Structured Features for Semantic Segmentation
CNN-aware Binary Map for General Semantic Segmentation
- intro: ICIP 2016 Best Paper / Student Paper Finalist
- arxiv: https://arxiv.org/abs/1609.09220
Efficient Convolutional Neural Network with Binary Quantization Layer
Mixed context networks for semantic segmentation
- intro: Hikvision Research Institute
- arxiv: https://arxiv.org/abs/1610.05854
High-Resolution Semantic Labeling with Convolutional Neural Networks
Gated Feedback Refinement Network for Dense Image Labeling
- intro: CVPR 2017
- paper: http://www.cs.umanitoba.ca/~ywang/papers/cvpr17.pdf
RefineNet: Multi-Path Refinement Networks with Identity Mappings for High-Resolution Semantic Segmentation
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation
- intro: CVPR 2017. IoU 83.4% on PASCAL VOC 2012
- arxiv: https://arxiv.org/abs/1611.06612
- github: https://github.com/guosheng/refinenet
- leaderboard: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6#KEY_Multipath-RefineNet-Res152
Light-Weight RefineNet for Real-Time Semantic Segmentation
- intro: BMVC 2018
- arxiv: https://arxiv.org/abs/1810.03272
- github: https://github.com/drsleep/light-weight-refinenet
Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes
- keywords: Full-Resolution Residual Units (FRRU), Full-Resolution Residual Networks (FRRNs)
- arxiv: https://arxiv.org/abs/1611.08323
- github(Theano/Lasagne): https://github.com/TobyPDE/FRRN
- youtube: https://www.youtube.com/watch?v=PNzQ4PNZSzc
Semantic Segmentation using Adversarial Networks
- intro: Facebook AI Research & INRIA. NIPS Workshop on Adversarial Training, Dec 2016, Barcelona, Spain
- arxiv: https://arxiv.org/abs/1611.08408
- github(Chainer): https://github.com/oyam/Semantic-Segmentation-using-Adversarial-Networks
Improving Fully Convolution Network for Semantic Segmentation
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation
- intro: Montreal Institute for Learning Algorithms & Ecole Polytechnique de Montreal
- arxiv: https://arxiv.org/abs/1611.09326
- github: https://github.com/SimJeg/FC-DenseNet
- github: https://github.com/titu1994/Fully-Connected-DenseNets-Semantic-Segmentation
- github(Keras): https://github.com/0bserver07/One-Hundred-Layers-Tiramisu
Training Bit Fully Convolutional Network for Fast Semantic Segmentation
- intro: Megvii
- arxiv: https://arxiv.org/abs/1612.00212
Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection
- intro: “an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries.”
- arxiv: https://arxiv.org/abs/1612.01337
Diverse Sampling for Self-Supervised Learning of Semantic Segmentation
Mining Pixels: Weakly Supervised Semantic Segmentation Using Image Labels
- intro: Nankai University & University of Oxford & NUS
- arxiv: https://arxiv.org/abs/1612.02101
FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation
Understanding Convolution for Semantic Segmentation
- intro: UCSD & CMU & UIUC & TuSimple
- arxiv: https://arxiv.org/abs/1702.08502
- github(MXNet): https://github.com/TuSimple/TuSimple-DUC
- pretrained-models: https://drive.google.com/drive/folders/0B72xLTlRb0SoREhISlhibFZTRmM
Label Refinement Network for Coarse-to-Fine Semantic Segmentation
- arxiv: https://www.arxiv.org/abs/1703.00551
Predicting Deeper into the Future of Semantic Segmentation
- intro: Facebook AI Research
- arxiv: https://arxiv.org/abs/1703.07684
Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach
- intro: CVPR 2017 (oral)
- keywords: Adversarial Erasing (AE)
- arxiv: https://arxiv.org/abs/1703.08448
Guided Perturbations: Self Corrective Behavior in Convolutional Neural Networks
- intro: University of Maryland & GE Global Research Center
- arxiv: https://arxiv.org/abs/1703.07928
Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade
- intro: CVPR 2017 spotlight paper
- arxiv: https://arxiv.org/abs/1704.01344
Large Kernel Matters – Improve Semantic Segmentation by Global Convolutional Network
- arxiv: https://arxiv.org/abs/1703.02719
Loss Max-Pooling for Semantic Image Segmentation
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1704.02966
Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation
- arxiv: https://arxiv.org/abs/1704.03593
A Review on Deep Learning Techniques Applied to Semantic Segmentation
- arxiv: https://arxiv.org/abs/1704.06857
Joint Semantic and Motion Segmentation for dynamic scenes using Deep Convolutional Networks
- intro: International Institute of Information Technology & Max Planck Institute For Intelligent Systems
- arxiv: https://arxiv.org/abs/1704.08331
ICNet for Real-Time Semantic Segmentation on High-Resolution Images
- intro: CUHK & Sensetime
- project page: https://hszhao.github.io/projects/icnet/
- arxiv: https://arxiv.org/abs/1704.08545
- github: https://github.com/hszhao/ICNet
- video: https://www.youtube.com/watch?v=qWl9idsCuLQ
Feature Forwarding: Exploiting Encoder Representations for Efficient Semantic Segmentation
LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation
- project page: https://codeac29.github.io/projects/linknet/
- arxiv: https://arxiv.org/abs/1707.03718
- github: https://github.com/e-lab/LinkNet
Pixel Deconvolutional Networks
- intro: Washington State University
- arxiv: https://arxiv.org/abs/1705.06820
Incorporating Network Built-in Priors in Weakly-supervised Semantic Segmentation
- intro: IEEE TPAMI
- arxiv: https://arxiv.org/abs/1706.02189
Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges
- intro: IEEE ITSC 2017
- arxiv: https://arxiv.org/abs/1707.02432
Semantic Segmentation with Reverse Attention
- intro: BMVC 2017 oral. University of Southern California
- arxiv: https://arxiv.org/abs/1707.06426
Stacked Deconvolutional Network for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1708.04943
Learning Dilation Factors for Semantic Segmentation of Street Scenes
- intro: GCPR 2017
- arxiv: https://arxiv.org/abs/1709.01956
A Self-aware Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1709.02764
One-Shot Learning for Semantic Segmentation
- intro: BMVC 2017
- arxiv: https://arxiv.org/abs/1709.03410
- github: https://github.com/lzzcd001/OSLSM
An Adaptive Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1709.02764
Semantic Segmentation from Limited Training Data
- arxiv: https://arxiv.org/abs/1709.07665
Unsupervised Domain Adaptation for Semantic Segmentation with GANs
- arxiv: https://arxiv.org/abs/1711.06969
Neuron-level Selective Context Aggregation for Scene Segmentation
- arxiv: https://arxiv.org/abs/1711.08278
Road Extraction by Deep Residual U-Net
- arxiv: https://arxiv.org/abs/1711.10684
Mix-and-Match Tuning for Self-Supervised Semantic Segmentation
- intro: AAAI 2018
- project page: http://mmlab.ie.cuhk.edu.hk/projects/M&M/
- arxiv: https://arxiv.org/abs/1712.00661
- github: https://github.com/XiaohangZhan/mix-and-match/
- github: https://github.com/liuziwei7/mix-and-match
Error Correction for Dense Semantic Image Labeling
- arxiv: https://arxiv.org/abs/1712.03812
Semantic Segmentation via Highly Fused Convolutional Network with Multiple Soft Cost Functions
- arxiv: https://arxiv.org/abs/1801.01317
RTSeg: Real-time Semantic Segmentation Comparative Study
ShuffleSeg: Real-time Semantic Segmentation Network
- intro: Cairo University
- arxiv: https://arxiv.org/abs/1803.03816
Dynamic-structured Semantic Propagation Network
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1803.06067
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
- project page: https://sacmehta.github.io/ESPNet/
- arxiv: https://arxiv.org/abs/1803.06815
- github: https://github.com/sacmehta/ESPNet
Context Encoding for Semantic Segmentation
- intro: CVPR 2018
- keywords: Synchronized Cross-GPU Batch Normalization
- arxiv: https://arxiv.org/abs/1803.08904
- github: https://github.com/zhanghang1989/PyTorch-Encoding
Adaptive Affinity Field for Semantic Segmentation
- intro: UC Berkeley / ICSI
- arxiv: https://arxiv.org/abs/1803.10335
Predicting Future Instance Segmentations by Forecasting Convolutional Features
- intro: Facebook AI Research & Univ. Grenoble Alpes
- arxiv: https://arxiv.org/abs/1803.11496
Fully Convolutional Adaptation Networks for Semantic Segmentation
- intro: CVPR 2018, Rank 1 in Segmentation Track of Visual Domain Adaptation Challenge 2017
- keywords: Fully Convolutional Adaptation Networks (FCAN), Appearance Adaptation Networks (AAN) and Representation Adaptation Networks (RAN)
- arxiv: https://arxiv.org/abs/1804.08286
Learning a Discriminative Feature Network for Semantic Segmentation
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1804.09337
Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation
- arxiv: https://arxiv.org/abs/1805.04141
Convolutional CRFs for Semantic Segmentation
ContextNet: Exploring Context and Detail for Semantic Segmentation in Real-time
- intro: Toshiba Research
- arxiv: https://arxiv.org/abs/1805.04554
DifNet: Semantic Segmentation by Diffusion Networks
- arxiv: https://arxiv.org/abs/1805.08015
Pyramid Attention Network for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1805.10180
Semantic Segmentation with Scarce Data
- intro: ICML 2018 Workshop
- arxiv: https://arxiv.org/abs/1807.00911
Attention to Refine through Multi-Scales for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1807.02917
Guided Upsampling Network for Real-Time Semantic Segmentation
- intro: BMVC 2018
- arxiv: https://arxiv.org/abs/1807.07466
Deep Learning for Semantic Segmentation on Minimal Hardware
- intro: RoboCup International Symposium 2018. University of Hertfordshire
- arxiv: https://arxiv.org/abs/1807.05597
Future Semantic Segmentation with Convolutional LSTM
- intro: BMVC 2018
- arxiv: https://arxiv.org/abs/1807.07946
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1808.00897
Dual Attention Network for Scene Segmentation
- arxiv: https://arxiv.org/abs/1809.02983
Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations
- arxiv: https://arxiv.org/abs/1809.04766
Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation
- arxiv: https://arxiv.org/abs/1809.06323
Semantic Image Segmentation by Scale-Adaptive Networks
- github(Caffe): https://github.com/speedinghzl/Scale-Adaptive-Network
Recurrent Iterative Gating Networks for Semantic Segmentation
- intro: WACV 2019
- arxiv: https://arxiv.org/abs/1811.08043
CGNet: A Light-weight Context Guided Network for Semantic Segmentation
CCNet: Criss-Cross Attention for Semantic Segmentation
- intro: Huazhong University of Science and Technology & Horizon Robotics & University of Illinois at Urbana-Champaign
- arxiv: https://arxiv.org/abs/1811.11721
- github: https://github.com/speedinghzl/CCNet
ShelfNet for Real-time Semantic Segmentation
- intro: Yale University
- arxiv: https://arxiv.org/abs/1811.11254
- github: https://github.com/juntang-zhuang/ShelfNet
Improving Semantic Segmentation via Video Propagation and Label Relaxation
- intro: CVPR 2019 oral
- arxiv: https://arxiv.org/abs/1812.01593
- github: https://github.com/NVIDIA/semantic-segmentation
RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free
Fast-SCNN: Fast Semantic Segmentation Network
- arxiv: https://arxiv.org/abs/1902.04502
Structured Knowledge Distillation for Semantic Segmentation
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1903.04197
In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images
- intro: CVPR 2019
- intro: University of Zagreb
- keywords: SwiftNet
- arxiv: https://arxiv.org/abs/1903.08469
- github: https://github.com/orsic/swiftnet
FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation
- intro: Chinese Academy of Sciences & Deepwise AI Lab
- keywords: Joint Pyramid Upsampling (JPU)
- project page: http://wuhuikai.me/FastFCNProject/
- arxiv: https://arxiv.org/abs/1903.11816
- github: https://github.com/wuhuikai/FastFCN
Significance-aware Information Bottleneck for Domain Adaptive Semantic Segmentation
- intro: HUST & UTS
- arxiv: https://arxiv.org/abs/1904.00876
GFF: Gated Fully Fusion for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1904.01803
DADA: Depth-aware Domain Adaptation in Semantic Segmentation
- arxiv: https://arxiv.org/abs/1904.01886
DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation
- intro: Megvii Technology
- arxiv: https://arxiv.org/abs/1904.02216
ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation
- arxiv: https://arxiv.org/abs/1906.09826
- github(official): https://github.com/xiaoyufenfei/ESNet
Gated-SCNN: Gated Shape CNNs for Semantic Segmentation
- intro: NVIDIA & University of Waterloo & University of Toronto & Vector Institute
- project page: https://nv-tlabs.github.io/GSCNN/
- arxiv: https://arxiv.org/abs/1907.05740
DABNet: Depth-wise Asymmetric Bottleneck for Real-time Semantic Segmentation
- intro: BMVC 2019
- arxiv: https://arxiv.org/abs/1907.11830
Dynamic Graph Message Passing Networks
- intro: CVPR 2020 oral
- arxiv: https://arxiv.org/abs/1908.06955
Squeeze-and-Attention Networks for Semantic Segmentation
https://arxiv.org/abs/1909.03402
Global Aggregation then Local Distribution in Fully Convolutional Networks
- intro: BMVC 2019
- arxiv: https://arxiv.org/abs/1909.07229
- github: https://github.com/lxtGH/GALD-Net
Graph-guided Architecture Search for Real-time Semantic Segmentation
https://arxiv.org/abs/1909.06793
Feature Pyramid Encoding Network for Real-time Semantic Segmentation
- intro: BMVC 2019
- arxiv: https://arxiv.org/abs/1909.08599
ACFNet: Attentional Class Feature Network for Semantic Segmentation
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1909.09408
Region Mutual Information Loss for Semantic Segmentation
- intro: NeurIPS 2019
- arxiv: https://arxiv.org/abs/1910.12037
- github: https://github.com/ZJULearning/RMI
Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation
- intro: NeurIPS 2019
- arxiv: https://arxiv.org/abs/1910.13049
- github: https://github.com/RogerZhangzz/CAG_UDA
Efficacy of Pixel-Level OOD Detection for Semantic Segmentation
https://arxiv.org/abs/1911.02897
Location-aware Upsampling for Semantic Segmentation
- keywords: LaU
- arxiv: https://arxiv.org/abs/1911.05250
- github: https://github.com/HolmesShuan/Location-aware-Upsampling-for-Semantic-Segmentation
FasterSeg: Searching for Faster Real-time Semantic Segmentation
- intro: ICLR 2020
- intro: Texas A&M University & Horizon Robotics Inc.
- arxiv: https://arxiv.org/abs/1912.10917
AlignSeg: Feature-Aligned Segmentation Networks
https://arxiv.org/abs/2003.00872
Deep Grouping Model for Unified Perceptual Parsing
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2003.11647
Spatial Pyramid Based Graph Reasoning for Semantic Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2003.10211
Learning Dynamic Routing for Semantic Segmentation
- intro: CVPR 2020 oral
- arxiv: https://arxiv.org/abs/2003.10401
- github(official): https://github.com/yanwei-li/DynamicRouting
Learning to Predict Context-adaptive Convolution for Semantic Segmentation
https://arxiv.org/abs/2004.08222
Transferring and Regularizing Prediction for Semantic Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2006.06570
Tensor Low-Rank Reconstruction for Semantic Segmentation
- intro: ECCV 2020
- intro: Top-1 performance on PASCAL-VOC12
- arxiv: https://arxiv.org/abs/2008.00490
- github: https://github.com/CWanli/RecoNet
Representative Graph Neural Network
- intro: ECCV 2020
- arxiv: https://arxiv.org/abs/2008.05202
EfficientFCN: Holistically-guided Decoding for Semantic Segmentation
https://arxiv.org/abs/2008.10487
Improving Semantic Segmentation via Decoupled Body and Edge Supervision
- intro: ECCV 2020
- arxiv: https://arxiv.org/abs/2007.10035
- github: https://github.com/lxtGH/DecoupleSegNets
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation
https://arxiv.org/abs/2010.07930
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation
Importance-Aware Semantic Segmentation in Self-Driving with Discrete Wasserstein Training
- intro: AAAI 2020
- arxiv: https://arxiv.org/abs/2010.12440
Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation
- intro: NeurIPS 2020 oral
- arxiv: https://arxiv.org/abs/2011.00147
- github: https://github.com/kgl-prml/Pixel-Level-Cycle-Association
CABiNet: Efficient Context Aggregation Network for Low-Latency Semantic Segmentation
- intro: University of Twente
- arxiv: https://arxiv.org/abs/2011.00993
SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation
https://arxiv.org/abs/2011.12025
Channel-wise Distillation for Semantic Segmentation
- arxiv: https://arxiv.org/abs/2011.13256
- github: https://github.com/drilistbox
BoxInst: High-Performance Instance Segmentation with Box Annotations
- intro: University of Adelaide
- arxiv: https://arxiv.org/abs/2012.02310
- github: https://github.com/aim-uofa/AdelaiDet/
Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU
Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation
- intro: AAAI 2021
- arxiv: https://arxiv.org/abs/2012.08226
HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation
- intro: Facebook AI & Tel Aviv University
- arxiv: https://arxiv.org/abs/2012.11582
SETR
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
- intro: CVPR 2021
- intro: Fudan University & University of Oxford & University of Surrey & Tencent Youtu Lab & Facebook AI
- project page: https://fudan-zvg.github.io/SETR/
- arxiv: https://arxiv.org/abs/2012.15840
- github: https://github.com/fudan-zvg/SETR
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
- intro: ICCV 2021 oral
- intro: Computer Vision Lab, ETH Zurich & SenseTime Research
- arxiv: https://arxiv.org/abs/2101.11939
- github: https://github.com/tfzhou/ContrastiveSeg
Active Boundary Loss for Semantic Segmentation
https://arxiv.org/abs/2102.02696
Learning Statistical Texture for Semantic Segmentation
- intro: CVPR 2021
- intro: Beihang University & SenseTime Research
- arxiv: https://arxiv.org/abs/2103.04133
Cross-Dataset Collaborative Learning for Semantic Segmentation
- intro: CVPR 2021
- intro: Xilinx Inc. & Chinese Academy of Sciences
- arxiv: https://arxiv.org/abs/2103.11351
Vision Transformers for Dense Prediction
- intro: Intel Labs
- arxiv: https://arxiv.org/abs/2103.13413
- github: https://github.com/intel-isl/DPT
InverseForm: A Loss Function for Structured Boundary-Aware Segmentation
- intro: CVPR 2021 oral
- intro: Qualcomm AI Research
- arxiv: https://arxiv.org/abs/2104.02745
Rethinking BiSeNet For Real-time Semantic Segmentation
- intro: Meituan
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2104.13188
- github: https://github.com/MichaelFan01/STDC-Seg
Segmenter: Transformer for Semantic Segmentation
- intro: Inria
- arxiv: https://arxiv.org/abs/2105.05633
- github: https://github.com/rstrudel/segmenter
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
https://arxiv.org/abs/2105.15203
Per-Pixel Classification is Not All You Need for Semantic Segmentation
- intro: UIUC & FAIR
- project page: https://bowenc0221.github.io/maskformer/
- arxiv: https://arxiv.org/abs/2107.06278
A Unified Efficient Pyramid Transformer for Semantic Segmentation
- intro: School of Data Science, Fudan University & Amazon Web Services & University of California, Davis
- arxiv: https://arxiv.org/abs/2107.14209
Deep Metric Learning for Open World Semantic Segmentation
- intro: ICCV 2021
- arxiv: https://arxiv.org/abs/2108.04562
Multi-Anchor Active Domain Adaptation for Semantic Segmentation
- intro: ICCV 2021 Oral
- arxiv: https://arxiv.org/abs/2108.08012
Generalize then Adapt: Source-Free Domain Adaptive Semantic Segmentation
- intro: ICCV 2021
- intro: Indian Institute of Science & Google Research
- project page: https://sites.google.com/view/sfdaseg
- arxiv: https://arxiv.org/abs/2108.11249
HRFormer: High-Resolution Transformer for Dense Prediction
- intro: NeurIPS 2021
- intro: University of Chinese Academy of Sciences & Institute of Computing Technology, CAS & Peking University & Microsoft Research Asia & Baidu
- arxiv: https://arxiv.org/abs/2110.09408
- github: https://github.com/HRNet/HRFormer
Deep Hierarchical Semantic Segmentation
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2203.14335
- github: https://github.com/0liliulei/HieraSeg
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2204.05525
- github: https://github.com/hustvl/TopFormer
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
- intro: The Hong Kong University of Science and Technology & Tsinghua University & International Digital Economy Academy (IDEA) & The Hong Kong University of Science and Technology (Guangzhou)
- arxiv: https://arxiv.org/abs/2206.02777
- github: https://github.com/IDEACVR/MaskDINO
Instance Segmentation
Simultaneous Detection and Segmentation
- intro: ECCV 2014
- author: Bharath Hariharan, Pablo Arbelaez, Ross Girshick, Jitendra Malik
- arxiv: http://arxiv.org/abs/1407.1808
- github(Matlab): https://github.com/bharath272/sds_eccv2014
Convolutional Feature Masking for Joint Object and Stuff Segmentation
- intro: CVPR 2015
- keywords: masking layers
- arxiv: https://arxiv.org/abs/1412.1283
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Dai_Convolutional_Feature_Masking_2015_CVPR_paper.pdf
Proposal-free Network for Instance-level Object Segmentation
Hypercolumns for object segmentation and fine-grained localization
- intro: CVPR 2015
- arxiv: https://arxiv.org/abs/1411.5752
- paper: http://www.cs.berkeley.edu/~bharath2/pubs/pdfs/BharathCVPR2015.pdf
SDS using hypercolumns
Learning to decompose for object detection and instance segmentation
- intro: ICLR 2016 Workshop
- keywords: CNN / RNN, MNIST, KITTI
- arxiv: http://arxiv.org/abs/1511.06449
Recurrent Instance Segmentation
- intro: ECCV 2016
- project page: http://romera-paredes.com/ris
- arxiv: http://arxiv.org/abs/1511.08250
- github(Torch): https://github.com/bernard24/ris
- poster: http://www.eccv2016.org/files/posters/P-4B-46.pdf
- youtube: https://www.youtube.com/watch?v=l_WD2OWOqBk
Instance-sensitive Fully Convolutional Networks
- intro: ECCV 2016. instance segment proposal
- arxiv: http://arxiv.org/abs/1603.08678
Amodal Instance Segmentation
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1604.08202
Bridging Category-level and Instance-level Semantic Image Segmentation
- keywords: online bootstrapping
- arxiv: http://arxiv.org/abs/1605.06885
Bottom-up Instance Segmentation using Deep Higher-Order CRFs
- intro: BMVC 2016
- arxiv: http://arxiv.org/abs/1609.02583
DeepCut: Object Segmentation from Bounding Box Annotations using Convolutional Neural Networks
End-to-End Instance Segmentation and Counting with Recurrent Attention
- intro: ReInspect
- arxiv: http://arxiv.org/abs/1605.09410
Translation-aware Fully Convolutional Instance Segmentation
Fully Convolutional Instance-aware Semantic Segmentation
- intro: CVPR 2017 Spotlight paper. winning entry of COCO segmentation challenge 2016
- keywords: TA-FCN / FCIS
- arxiv: https://arxiv.org/abs/1611.07709
- github: https://github.com/msracver/FCIS
- slides: https://onedrive.live.com/?cid=f371d9563727b96f&id=F371D9563727B96F%2197213&authkey=%21AEYOyOirjIutSVk
InstanceCut: from Edges to Instances with MultiCut
Deep Watershed Transform for Instance Segmentation
Object Detection Free Instance Segmentation With Labeling Transformations
Shape-aware Instance Segmentation
Interpretable Structure-Evolving LSTM
- intro: CMU & Sun Yat-sen University & National University of Singapore & Adobe Research
- intro: CVPR 2017 spotlight paper
- arxiv: https://arxiv.org/abs/1703.03055
Mask R-CNN
- intro: ICCV 2017 Best paper award. Facebook AI Research
- arxiv: https://arxiv.org/abs/1703.06870
- slides: http://kaiminghe.com/iccv17tutorial/maskrcnn_iccv2017_tutorial_kaiminghe.pdf
- github(official, Caffe2): https://github.com/facebookresearch/Detectron
- github: https://github.com/facebookresearch/maskrcnn-benchmark
- github: https://github.com/TuSimple/mx-maskrcnn
- slides: https://lmb.informatik.uni-freiburg.de/lectures/seminar_brox/seminar_ss17/maskrcnn_slides.pdf
- github(Keras+TensorFlow): https://github.com/matterport/Mask_RCNN
Faster Training of Mask R-CNN by Focusing on Instance Boundaries
- intro: BMW Car IT GmbH
- arxiv: https://arxiv.org/abs/1809.07069
Boundary-preserving Mask R-CNN
- intro: ECCV 2020
- intro: Huazhong University of Science and Technology & Horizon Robotics Inc.
- arxiv: https://arxiv.org/abs/2007.08921
- github: https://github.com/hustvl/BMaskR-CNN
Semantic Instance Segmentation via Deep Metric Learning
https://arxiv.org/abs/1703.10277
Pose2Instance: Harnessing Keypoints for Person Instance Segmentation
https://arxiv.org/abs/1704.01152
Pixelwise Instance Segmentation with a Dynamically Instantiated Network
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1704.02386
Instance-Level Salient Object Segmentation
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1704.03604
MEnet: A Metric Expression Network for Salient Object Segmentation
- intro: IJCAI
- arxiv: https://arxiv.org/abs/1805.05638
Semantic Instance Segmentation with a Discriminative Loss Function
- intro: Published at “Deep Learning for Robotic Vision”, workshop at CVPR 2017
- arxiv: https://arxiv.org/abs/1708.02551
- github: https://github.com/Wizaron/instance-segmentation-pytorch
SceneCut: Joint Geometric and Object Segmentation for Indoor Scenes
https://arxiv.org/abs/1709.07158
S4 Net: Single Stage Salient-Instance Segmentation
Deep Extreme Cut: From Extreme Points to Object Segmentation
https://arxiv.org/abs/1711.09081
Learning to Segment Every Thing
- intro: CVPR 2018. UC Berkeley & Facebook AI Research
- keywords: MaskX R-CNN
- project page: http://ronghanghu.com/seg_every_thing/
- arxiv: https://arxiv.org/abs/1711.10370
- github(official, Caffe2): https://github.com/ronghanghu/seg_every_thing
Recurrent Neural Networks for Semantic Instance Segmentation
- project page: https://imatge-upc.github.io/rsis/
- arxiv: https://arxiv.org/abs/1712.00617
- github: https://github.com/imatge-upc/rsis
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
- intro: Google Inc. & RWTH Aachen University & UCLA
- arxiv: https://arxiv.org/abs/1712.04837
Recurrent Pixel Embedding for Instance Grouping
- intro: learning to embed pixels and group them into boundaries, object proposals, semantic segments and instances.
- project page: http://www.ics.uci.edu/~skong2/SMMMSG.html
- arxiv: https://arxiv.org/abs/1712.08273
- github: https://github.com/aimerykong/Recurrent-Pixel-Embedding-for-Instance-Grouping
- slides: http://www.ics.uci.edu/~skong2/slides/pixel_embedding_for_grouping_public_version.pdf
- poster: http://www.ics.uci.edu/~skong2/slides/pixel_embedding_for_grouping_poster.pdf
Annotation-Free and One-Shot Learning for Instance Segmentation of Homogeneous Object Clusters
https://arxiv.org/abs/1802.00383
Path Aggregation Network for Instance Segmentation
- intro: CVPR 2018 Spotlight
- intro: CUHK & Peking University & SenseTime Research & YouTu Lab
- keywords: PANet
- arxiv: https://arxiv.org/abs/1803.01534
- github: https://github.com/ShuLiu1993/PANet
Learning to Segment via Cut-and-Paste
- intro: Google
- keywords: weakly-supervised, adversarial learning setup
- arxiv: https://arxiv.org/abs/1803.06414
Learning to Cluster for Proposal-Free Instance Segmentation
https://arxiv.org/abs/1803.06459
Bayesian Semantic Instance Segmentation in Open Set World
https://arxiv.org/abs/1806.00911
TernausNetV2: Fully Convolutional Network for Instance Segmentation
Dynamic Multimodal Instance Segmentation guided by natural language queries
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1807.02257
- github: https://github.com/andfoy/query-objseg
Traits & Transferability of Adversarial Examples against Instance Segmentation & Object Detection
https://arxiv.org/abs/1808.01452
Affinity Derivation and Graph Merge for Instance Segmentation
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1811.10870
One-Shot Instance Segmentation
- intro: University of Tübingen
- arxiv: https://arxiv.org/abs/1811.11507
Hybrid Task Cascade for Instance Segmentation
- intro: CVPR 2019
- intro: The Chinese University of Hong Kong & SenseTime Research & Zhejiang University & The University of Sydney & Nanyang Technological University
- intro: Winning entry of COCO 2018 Challenge (object detection task)
- arxiv: https://arxiv.org/abs/1901.07518
- github(mmdetection): https://github.com/open-mmlab/mmdetection/tree/master/configs/htc
Mask Scoring R-CNN
- intro: CVPR 2019
- intro: Huazhong University of Science and Technology & Horizon Robotics Inc.
- arxiv: https://arxiv.org/abs/1903.00241
- github: https://github.com/zjhuang22/maskscoring_rcnn
TensorMask: A Foundation for Dense Object Segmentation
- intro: Facebook AI Research (FAIR)
- arxiv: https://arxiv.org/abs/1903.12174
Actor-Critic Instance Segmentation
- intro: CVPR 2019
- keywords: reinforcement learning
- arxiv: https://arxiv.org/abs/1904.05126
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1908.07801
- github: https://github.com/GothicAi/Instaboost
SSAP: Single-Shot Instance Segmentation With Affinity Pyramid
- intro: ICCV 2019
- intro: Chinese Academy of Sciences & Horizon Robotics, Inc
- arxiv: https://arxiv.org/abs/1909.01616
YOLACT: Real-time Instance Segmentation
- intro: You Only Look At CoefficienTs
- intro: University of California, Davis
- keywords: one-stage, Fast NMS
- arxiv: https://arxiv.org/abs/1904.02689
- github(official, PyTorch): https://github.com/dbolya/yolact
YOLACT++: Better Real-time Instance Segmentation
https://arxiv.org/abs/1912.06218
YolactEdge: Real-time Instance Segmentation on the Edge
PolarMask: Single Shot Instance Segmentation with Polar Representation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/1909.13226
- github: https://github.com/xieenze/PolarMask
PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond
- intro: TPAMI 2021
- arxiv: https://arxiv.org/abs/2105.02184
- github: https://github.com/xieenze/PolarMask
CenterMask : Real-Time Anchor-Free Instance Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/1911.06667
- github: https://github.com/youngwanLEE/CenterMask
- github: https://github.com/youngwanLEE/centermask2
CenterMask: single shot instance segmentation with point representation
- intro: CVPR 2020
- intro: Meituan Dianping Group
- arxiv: https://arxiv.org/abs/2004.04446
Shape-aware Feature Extraction for Instance Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/1911.11263
PolyTransform: Deep Polygon Transformer for Instance Segmentation
https://arxiv.org/abs/1912.02801
EmbedMask: Embedding Coupling for One-stage Instance Segmentation
SAIS: Single-stage Anchor-free Instance Segmentation
https://arxiv.org/abs/1912.01176
SOLO: Segmenting Objects by Locations
- arxiv: https://arxiv.org/abs/1912.04488
- github: https://github.com/WXinlong/SOLO
SOLOv2: Dynamic, Faster and Stronger
SOLO: A Simple Framework for Instance Segmentation
RDSNet: A New Deep Architecture for Reciprocal Object Detection and Instance Segmentation
- intro: AAAI 2020
- intro: Chinese Academy of Sciences & Horizon Robotics Inc.
- arxiv: https://arxiv.org/abs/1912.05070
- github: https://github.com/wangsr126/RDSNet
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
https://arxiv.org/abs/2001.00309
Conditional Convolutions for Instance Segmentation
- intro: ECCV 2020 oral
- intro: The University of Adelaide
- arxiv: https://arxiv.org/abs/2003.05664
- github: https://github.com/aim-uofa/adet
PointINS: Point-based Instance Segmentation
- intro: CUHK & MEGVII & Chinese Academy of Sciences & SmartMore
- arxiv: https://arxiv.org/abs/2003.06148
1st Place Solutions for OpenImage2019 – Object Detection and Instance Segmentation
https://arxiv.org/abs/2003.07557
Mask Encoding for Single Shot Instance Segmentation
- intro: CVPR 2020
- intro: Tongji University & University of Adelaide & Huawei Noah’s Ark Lab
- arxiv: https://arxiv.org/abs/2003.11712
The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation
Deep Variational Instance Segmentation
https://arxiv.org/abs/2007.11576
Mask Point R-CNN
https://arxiv.org/abs/2008.00460
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation
- intro: ACM MM 2020
- arxiv: https://arxiv.org/abs/2008.05676
- github: https://github.com/JialianW/Forest_RCNN
Seesaw Loss for Long-Tailed Instance Segmentation
https://arxiv.org/abs/2008.10032
Joint COCO and Mapillary Workshop at ICCV 2019: COCO Instance Segmentation Challenge Track
- intro: 1st Place Technical Report in ICCV 2019 / ECCV 2020: MegDetV2
- arxiv: https://arxiv.org/abs/2010.02475
DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation
- intro: Zhejiang University & Alibaba Group
- arxiv: https://arxiv.org/abs/2011.09876
The Devil is in the Boundary: Exploiting Boundary Representation for Basis-based Instance Segmentation
https://arxiv.org/abs/2011.13241
Robust Instance Segmentation through Reasoning about Multi-Object Occlusion
https://arxiv.org/abs/2012.02107
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
- intro: Google Research & UC Berkeley & Cornell University
- arxiv: https://arxiv.org/abs/2012.07177
How Shift Equivariance Impacts Metric Learning for Instance Segmentation
https://arxiv.org/abs/2101.05846
FASA: Feature Augmentation and Sampling Adaptation for Long-Tailed Instance Segmentation
- intro: Nanyang Technological University & Carnegie Mellon University
- arxiv: https://arxiv.org/abs/2102.12867
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2103.12340
- github: https://github.com/lkeab/BCNet
- youtube: https://www.youtube.com/watch?v=iHlGJppJGiQ
- zhihu: https://zhuanlan.zhihu.com/p/378269087
Sparse Object-level Supervision for Instance Segmentation with Pixel Embeddings
FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2104.00073
ISTR: End-to-End Instance Segmentation with Transformers
- arxiv: https://arxiv.org/abs/2105.00637
- github: https://github.com/hujiecpp/ISTR
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers
- intro: CVPR 2021
- intro: The Hong Kong University of Science and Technology & Kuaishou Technology
- keywords: BCNet
- arxiv: https://arxiv.org/abs/2103.12340
- github: https://github.com/lkeab/BCNet
SOLQ: Segmenting Objects by Learning Queries
- intro: MEGVII Technology
- arxiv: https://arxiv.org/abs/2106.02351
- github: https://github.com/megvii-research/SOLQ
1st Place Solution for YouTubeVOS Challenge 2021:Video Instance Segmentation
- intro: CVPR 2021 Workshop
- arxiv: https://arxiv.org/abs/2106.06649
Rank & Sort Loss for Object Detection and Instance Segmentation
- intro: ICCV 2021 Oral
- arxiv: https://arxiv.org/abs/2107.11669
- github: https://github.com/kemaloksuz/RankSortLoss
SOTR: Segmenting Objects with Transformers
- intro: ICCV 2021
- arxiv: https://arxiv.org/abs/2108.06747
- github: https://github.com/easton-cau/SOTR
FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
- intro: ICCV 2021
- arxiv: https://arxiv.org/abs/2108.07058
- github: https://github.com/EMI-Group/FaPN
Instances as Queries
- intro: ICCV 2021
- intro: HUST & Tencent
- arxiv: https://arxiv.org/abs/2105.01928
- github: https://github.com/hustvl/QueryInst
Mask Transfiner for High-Quality Instance Segmentation
- intro: ETH Zurich & HKUST & Kuaishou Technology
- arxiv: https://arxiv.org/abs/2111.13673
SOIT: Segmenting Objects with Instance-Aware Transformers
- intro: AAAI 2022
- arxiv: https://arxiv.org/abs/2112.11037
- github: https://github.com/yuxiaodongHRI/SOIT
ContrastMask: Contrastive Learning to Segment Every Thing
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2203.09775
Sparse Instance Activation for Real-Time Instance Segmentation
- intro: CVPR 2022
- intro: Huazhong University of Science & Technology & Horizon Robotics & CASIA
- arxiv: https://arxiv.org/abs/2203.12827
- github: https://github.com/hustvl/SparseInst
Human Instance Segmentation
PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model
- intro: Google, Inc.
- keywords: Person detection and pose estimation, segmentation and grouping
- arxiv: https://arxiv.org/abs/1803.08225
Pose2Seg: Detection Free Human Instance Segmentation
- intro: CVPR 2019
- intro: Tsinghua University & BNRist & Tencent AI Lab & Cardiff University
- keywords: Occluded Human (OCHuman)
- project page: http://www.liruilong.cn/Pose2Seg/index.html
- arxiv: https://arxiv.org/abs/1803.10683
- github: https://github.com/liruilong940607/Pose2Seg
- dataset: https://cg.cs.tsinghua.edu.cn/dataset/form.html?dataset=ochuman
Bounding Box Embedding for Single Shot Person Instance Segmentation
https://arxiv.org/abs/1807.07674
Parsing R-CNN for Instance-Level Human Analysis
- intro: COCO 2018 DensePose Challenge Winner
- arxiv: https://arxiv.org/abs/1811.12596
- github: https://github.com/soeaver/Parsing-R-CNN
Graphonomy: Universal Human Parsing via Graph Transfer Learning
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1904.04536
- github: https://github.com/Gaoyiminggithub/Graphonomy
Video Instance Segmentation
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation
- intro: ECCV 2020
- arxiv: https://arxiv.org/abs/2007.14772
- github: https://github.com/JialeCao001/SipMask
End-to-End Video Instance Segmentation with Transformers
- intro: Meituan & The University of Adelaide
- arxiv: https://arxiv.org/abs/2011.14503
Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation
- intro: CVPR 2021
- intro: The Hong Kong Polytechnic University & DAMO Academy, Alibaba Group
- arxiv: https://arxiv.org/abs/2104.05606
- github: https://github.com/MinghanLi/STMask
Tracking Instances as Queries
- intro: HUST & Tencent PCG
- arxiv: https://arxiv.org/abs/2106.11963
Video Mask Transfiner for High-Quality Video Instance Segmentation
- intro: ECCV 2022
- intro: ETH Zürich & The Hong Kong University of Science and Technology & Kuaishou Technology
- arxiv: https://arxiv.org/abs/2207.14012
Panoptic Segmentation
Panoptic Segmentation
- intro: Facebook AI Research (FAIR) & Heidelberg University
- arxiv: https://arxiv.org/abs/1801.00868
- slides: http://presentations.cocodataset.org/COCO17-Invited-PanopticAlexKirillov.pdf
Panoptic Segmentation with a Joint Semantic and Instance Segmentation Network
https://arxiv.org/abs/1809.02110
Learning to Fuse Things and Stuff
- intro: Toyota Research Institute (TRI)
- keywords: TASCNet
- arxiv: https://arxiv.org/abs/1812.01192
Attention-guided Unified Network for Panoptic Segmentation
- intro: CVPR 2019
- intro: University of Chinese Academy of Sciences & Horizon Robotics, Inc. & The Johns Hopkins University
- arxiv: https://arxiv.org/abs/1812.03904
Panoptic Feature Pyramid Networks
- intro: FAIR
- arxiv: https://arxiv.org/abs/1901.02446
UPSNet: A Unified Panoptic Segmentation Network
- intro: Uber ATG & University of Toronto & The Chinese University of Hong Kong
- arxiv: https://arxiv.org/abs/1901.03784
Single Network Panoptic Segmentation for Street Scene Understanding
https://arxiv.org/abs/1902.02678
An End-to-End Network for Panoptic Segmentation
https://arxiv.org/abs/1903.05027
Learning Instance Occlusion for Panoptic Segmentation
https://arxiv.org/abs/1906.05896
SpatialFlow: Bridging All Tasks for Panoptic Segmentation
https://arxiv.org/abs/1910.08787
Single-Shot Panoptic Segmentation
https://arxiv.org/abs/1911.00764
SOGNet: Scene Overlap Graph Network for Panoptic Segmentation
- intro: AAAI 2020. Innovation Award in COCO 2019 challenge
- arxiv: https://arxiv.org/abs/1911.07527
Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation
- intro: UIUC & Google Research
- arxiv: https://arxiv.org/abs/1911.10194
PanDA: Panoptic Data Augmentation
https://arxiv.org/abs/1911.12317
Real-Time Panoptic Segmentation from Dense Detections
- intro: CVPR 2020 oral
- arxiv: https://arxiv.org/abs/1912.01202
- github: https://github.com/TRI-ML/realtime_panoptic
Bipartite Conditional Random Fields for Panoptic Segmentation
https://arxiv.org/abs/1912.05307
Unifying Training and Inference for Panoptic Segmentation
https://arxiv.org/abs/2001.04982
Towards Bounding-Box Free Panoptic Segmentation
- intro: SLAMcore Ltd. & Imperial College London
- arxiv: https://arxiv.org/abs/2002.07705
A Benchmark for LiDAR-based Panoptic Segmentation based on KITTI
- project page: http://semantic-kitti.org/
- arxiv: https://arxiv.org/abs/2003.02371
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
- intro: Johns Hopkins University & Google Research
- arxiv: https://arxiv.org/abs/2003.07853
EPSNet: Efficient Panoptic Segmentation Network with Cross-layer Attention Fusion
https://arxiv.org/abs/2003.10142
Pixel Consensus Voting for Panoptic Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2004.01849
EfficientPS: Efficient Panoptic Segmentation
Video Panoptic Segmentation
- intro: CVPR 2020 Oral
- intro: KAIST & Adobe Research
- arxiv: https://arxiv.org/abs/2006.11339
- github: https://github.com/mcahny/vps
PanoNet: Real-time Panoptic Segmentation through Position-Sensitive Feature Embedding
https://arxiv.org/abs/2008.00192
Robust Vision Challenge 2020 – 1st Place Report for Panoptic Segmentation
https://arxiv.org/abs/2008.10112
Learning Category- and Instance-Aware Pixel Embedding for Fast Panoptic Segmentation
- intro: Chinese Academy of Sciences & Horizon Robotics
- arxiv: https://arxiv.org/abs/2009.13342
Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation
- intro: NeurIPS 2020
- intro: Sun Yat-sen University & Huawei Noah’s Ark Lab & DarkMatter AI Research
- arxiv: https://arxiv.org/abs/2010.16119
- github: https://github.com/Jacobew/AutoPanoptic
Scaling Wide Residual Networks for Panoptic Segmentation
- intro: Google Research & Johns Hopkins University
- arxiv: https://arxiv.org/abs/2011.11675
Fully Convolutional Networks for Panoptic Segmentation
- intro: Chinese University of Hong Kong & University of Oxford & University of Hong Kong & MEGVII Technology
- arxiv: https://arxiv.org/abs/2012.00720
- github: https://github.com/yanwei-li/PanopticFCN
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
- intro: Johns Hopkins University & Google Research
- arxiv: https://arxiv.org/abs/2012.00759
Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation
- intro: AAAI 2021
- intro: Sun Yat-Sen University & Huawei Noah’s Ark Lab & Shanghai Jiao Tong University
- arxiv: https://arxiv.org/abs/2012.03603
ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation
- intro: Johns Hopkins University & Google Research
- arxiv: https://arxiv.org/abs/2012.05258
- github: https://github.com/joe-siyuan-qiao/ViP-DeepLab
STEP: Segmenting and Tracking Every Pixel
- intro: Technical University Munich & Google Research & RWTH Aachen University & MPI-IS and University of Tübingen
- arxiv: https://arxiv.org/abs/2102.11859
Cross-View Regularization for Domain Adaptive Panoptic Segmentation
- intro: CVPR 2021 oral
- arxiv: https://arxiv.org/abs/2103.02584
Panoptic Segmentation Forecasting
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2104.03962
Exemplar-Based Open-Set Panoptic Segmentation Network
- intro: CVPR 2021
- intro: Seoul National University & Adobe Research
- project page: https://cv.snu.ac.kr/research/EOPSN/
- arxiv: https://arxiv.org/abs/2105.08336
- github: https://github.com/jd730/EOPSN
Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2106.04555
Part-aware Panoptic Segmentation
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2106.06351
- github: https://github.com/tue-mps/panoptic_parts
Panoptic SegFormer
- intro: Nanjing University & The University of Hong Kong & NVIDIA & Caltech
- arxiv: https://arxiv.org/abs/2109.03814
Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation
- intro: Samsung Research China - Beijing (SRC-B) & Samsung Advanced Institute of Technology (SAIT) & University of Oxford & The University of Hong Kong
- arxiv: https://arxiv.org/abs/2112.08949
CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation
- intro: Zhejiang University & Tencent Youtu Lab & Shanghai Jiao Tong University
- arxiv: https://arxiv.org/abs/2201.04796
Panoptic, Instance and Semantic Relations: A Relational Context Encoder to Enhance Panoptic Segmentation
- intro: CVPR 2022
- intro: Qualcomm AI Research
- arxiv: https://arxiv.org/abs/2204.05370
PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation
- intro: CVPR 2022
- intro: Chinese Academy of Sciences & University of Chinese Academy of Sciences & Horizon Robotics, Inc.
- arxiv: https://arxiv.org/abs/2206.00468
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
- intro: CVPR 2022 Oral
- intro: Johns Hopkins University & KAIST & Google Research
- arxiv: https://arxiv.org/abs/2206.08948
Uncertainty-aware Panoptic Segmentation
- intro: Technical University Nurnberg
- arxiv: https://arxiv.org/abs/2206.14554
k-means Mask Transformer
- intro: ECCV 2022
- intro: Johns Hopkins University & Google Research
- arxiv: https://arxiv.org/abs/2207.04044
- github: https://github.com/google-research/deeplab2
Nighttime Segmentation
Nighttime sky/cloud image segmentation
- intro: ICIP 2017
- arxiv: https://arxiv.org/abs/1705.10583
Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime
- intro: International Conference on Intelligent Transportation Systems (ITSC 2018)
- arxiv: https://arxiv.org/abs/1810.02575
Semantic Nighttime Image Segmentation with Synthetic Stylized Data, Gradual Adaptation and Uncertainty-Aware Evaluation
Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
- intro: ICCV 2019
- intro: ETH Zurich & KU Leuven
- arxiv: https://arxiv.org/abs/1901.05946
Bi-Mix: Bidirectional Mixing for Domain Adaptive Nighttime Semantic Segmentation
DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation
- intro: CVPR 2021 oral
- intro: University of South Carolina & Farsee2 Technology Ltd
- arxiv: https://arxiv.org/abs/2104.10834
- github: https://github.com/W-zx-Y/DANNet
NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2204.05538
- github: https://github.com/xdeng7/NightLab
Boosting Night-time Scene Parsing with Learnable Frequency
- intro: Shanghai University & City University of Hong Kong & East China Normal University & Shanghai Jiao Tong University
- arxiv: https://arxiv.org/abs/2208.14241
Face Parsing
Face Parsing via Recurrent Propagation
- intro: BMVC 2017
- arxiv: https://arxiv.org/abs/1708.01936
Face Parsing via a Fully-Convolutional Continuous CRF Neural Network
https://arxiv.org/abs/1708.03736
Face Parsing with RoI Tanh-Warping
- intro: Software School of Xiamen University & Microsoft Research
- arxiv: https://arxiv.org/abs/1906.01342
End-to-End Face Parsing via Interlinked Convolutional Neural Networks
https://arxiv.org/abs/2002.04831
RoI Tanh-polar Transformer Network for Face Parsing in the Wild
Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2203.14448
- github: https://github.com/deepinsight/insightface/tree/master/parsing/dml_csr
Specific Segmentation
A CNN Cascade for Landmark Guided Semantic Part Segmentation
- project page: http://aaronsplace.co.uk/
- paper: https://aaronsplace.co.uk/papers/jackson2016guided/jackson2016guided.pdf
End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks
Boundary-sensitive Network for Portrait Segmentation
https://arxiv.org/abs/1712.08675
Boundary-Aware Network for Fast and High-Accuracy Portrait Segmentation
- intro: Zhejiang University
- arxiv: https://arxiv.org/abs/1901.03814
Beef Cattle Instance Segmentation Using Fully Convolutional Neural Network
- intro: BMVC 2018
- arxiv: https://arxiv.org/abs/1807.01972
Face Mask Extraction in Video Sequence
- keywords: ConvLSTM & FCN
- arxiv: https://arxiv.org/abs/1807.09207
Segment Proposal
Learning to Segment Object Candidates
- intro: Facebook AI Research (FAIR)
- intro: DeepMask: learning segmentation proposals
- arxiv: http://arxiv.org/abs/1506.06204
- github: https://github.com/facebookresearch/deepmask
- github: https://github.com/abbypa/NNProject_DeepMask
Learning to Refine Object Segments
- intro: ECCV 2016. Facebook AI Research (FAIR)
- intro: SharpMask: an extension of DeepMask that generates higher-fidelity masks using an additional top-down refinement step.
- arxiv: http://arxiv.org/abs/1603.08695
- github: https://github.com/facebookresearch/deepmask
FastMask: Segment Object Multi-scale Candidates in One Shot
- intro: CVPR 2017. University of California & Fudan University & Megvii Inc.
- arxiv: https://arxiv.org/abs/1612.08843
- github: https://github.com/voidrank/FastMask
Scene Labeling / Scene Parsing
Indoor Semantic Segmentation using depth information
Recurrent Convolutional Neural Networks for Scene Parsing
- arxiv: http://arxiv.org/abs/1306.2795
- slides: http://people.ee.duke.edu/~lcarin/Yizhe8.14.2015.pdf
- github: https://github.com/NP-coder/CLPS1520Project
- github: https://github.com/rkargon/Scene-Labeling
Learning hierarchical features for scene labeling
Multi-modal unsupervised feature learning for rgb-d scene labeling
- intro: ECCV 2014
- paper: http://www3.ntu.edu.sg/home/wanggang/WangECCV2014.pdf
Scene Labeling with LSTM Recurrent Neural Networks
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
- arxiv: http://arxiv.org/abs/1603.08575
- notes: http://www.shortscience.org/paper?bibtexKey=journals/corr/EslamiHWTKH16
“Semantic Segmentation for Scene Understanding: Algorithms and Implementations” tutorial
- intro: 2016 Embedded Vision Summit
- youtube: https://www.youtube.com/watch?v=pQ318oCGJGY
Semantic Understanding of Scenes through the ADE20K Dataset
Learning Deep Representations for Scene Labeling with Guided Supervision
Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision
- intro: CUHK
- arxiv: https://arxiv.org/abs/1706.02493
Spatial As Deep: Spatial CNN for Traffic Scene Understanding
- intro: AAAI 2018
- arxiv: https://arxiv.org/abs/1712.06080
Multi-Path Feedback Recurrent Neural Network for Scene Parsing
Scene Labeling using Recurrent Neural Networks with Explicit Long Range Contextual Dependency
FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2204.01587
PSPNet
Pyramid Scene Parsing Network
- intro: CVPR 2017
- intro: mIoU of 85.4% on PASCAL VOC 2012 and 80.2% on Cityscapes; ranked 1st in the ImageNet Scene Parsing Challenge 2016
- project page: http://appsrv.cse.cuhk.edu.hk/~hszhao/projects/pspnet/index.html
- arxiv: https://arxiv.org/abs/1612.01105
- slides: http://image-net.org/challenges/talks/2016/SenseCUSceneParsing.pdf
- github: https://github.com/hszhao/PSPNet
- github: https://github.com/Vladkryvoruchko/PSPNet-Keras-tensorflow
Open Vocabulary Scene Parsing
https://arxiv.org/abs/1703.08769
Deep Contextual Recurrent Residual Networks for Scene Labeling
https://arxiv.org/abs/1704.03594
Fast Scene Understanding for Autonomous Driving
- intro: Published at “Deep Learning for Vehicle Perception”, workshop at the IEEE Symposium on Intelligent Vehicles 2017
- arxiv: https://arxiv.org/abs/1708.02550
FoveaNet: Perspective-aware Urban Scene Parsing
https://arxiv.org/abs/1708.02421
BlitzNet: A Real-Time Deep Network for Scene Understanding
- intro: INRIA
- arxiv: https://arxiv.org/abs/1708.02813
Semantic Foggy Scene Understanding with Synthetic Data
https://arxiv.org/abs/1708.07819
Scale-adaptive Convolutions for Scene Parsing
- intro: ICCV 2017
- paper: http://openaccess.thecvf.com/content_ICCV_2017/papers/Zhang_Scale-Adaptive_Convolutions_for_ICCV_2017_paper.pdf
Restricted Deformable Convolution based Road Scene Semantic Segmentation Using Surround View Cameras
https://arxiv.org/abs/1801.00708
Dense Recurrent Neural Networks for Scene Labeling
https://arxiv.org/abs/1801.06831
DenseASPP for Semantic Segmentation in Street Scenes
- intro: CVPR 2018
- paper: http://openaccess.thecvf.com/content_cvpr_2018/papers/Yang_DenseASPP_for_Semantic_CVPR_2018_paper.pdf
- github: https://github.com/DeepMotionAIResearch/DenseASPP
OCNet: Object Context Network for Scene Parsing
- intro: Microsoft Research
- arxiv: https://arxiv.org/abs/1809.00916
- github: https://github.com/PkuRainBow/OCNet
PSANet: Point-wise Spatial Attention Network for Scene Parsing
- intro: ECCV 2018
- project page: https://hszhao.github.io/projects/psanet/
- paper: https://hszhao.github.io/papers/eccv18_psanet.pdf
- slides: https://docs.google.com/presentation/d/1_brKNBtv8nVu_jOwFRGwVkEPAq8B8hEngBSQuZCWaZA/edit#slide=id.p
- github: https://github.com/hszhao/PSANet
Adaptive Context Network for Scene Parsing
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1911.01664
Semantic Flow for Fast and Accurate Scene Parsing
- intro: ECCV 2020 oral
- arxiv: https://arxiv.org/abs/2002.10120
- github: https://github.com/donnyyou/torchcv
Strip Pooling: Rethinking Spatial Pooling for Scene Parsing
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2003.13328
- github: https://github.com/Andrew-Qibin/SPNet
S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation
- intro: WACV 2021
- arxiv: https://arxiv.org/abs/2011.02265
Benchmarks
MIT Scene Parsing Benchmark
- homepage: http://sceneparsing.csail.mit.edu/
- github(devkit): https://github.com/CSAILVision/sceneparsing
Semantic Understanding of Urban Street Scenes: Benchmark Suite
https://www.cityscapes-dataset.com/benchmarks/
Challenges
Large-scale Scene Understanding Challenge
- homepage: http://lsun.cs.princeton.edu/
Places2 Challenge
http://places2.csail.mit.edu/challenge.html
Human Parsing
Human Parsing with Contextualized Convolutional Neural Network
- intro: ICCV 2015
- paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/html/Liang_Human_Parsing_With_ICCV_2015_paper.html
Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing
- intro: CVPR 2017. SYSU & CMU
- keywords: Look Into Person (LIP)
- project page: http://hcp.sysu.edu.cn/lip/
- arxiv: https://arxiv.org/abs/1703.05446
- github: https://github.com/Engineering-Course/LIP_SSL
Multiple-Human Parsing in the Wild
https://arxiv.org/abs/1705.07206
Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark
- intro: T-PAMI 2018
- keywords: Joint Body Parsing & Pose Estimation Network (JPPNet)
- arxiv: https://arxiv.org/abs/1804.01984
- github: https://github.com/Engineering-Course/LIP_JPPNet
Cross-domain Human Parsing via Adversarial Feature and Label Adaptation
- intro: AAAI 2018
- arxiv: https://arxiv.org/abs/1801.01260
Fusing Hierarchical Convolutional Features for Human Body Segmentation and Clothing Fashion Classification
- intro: Wuhan University
- arxiv: https://arxiv.org/abs/1803.03415
Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing
Macro-Micro Adversarial Network for Human Parsing
- intro: ECCV 2018
- keywords: Macro-Micro Adversarial Net (MMAN)
- arxiv: https://arxiv.org/abs/1807.08260
- github: https://github.com/RoyalVane/MMAN
Instance-level Human Parsing via Part Grouping Network
- intro: ECCV 2018 Oral
- arxiv: https://arxiv.org/abs/1808.00157
Adaptive Temporal Encoding Network for Video Instance-level Human Parsing
- intro: ACM MM 2018
- arxiv: https://arxiv.org/abs/1808.00661
- github(official, TensorFlow): https://github.com/HCPLab-SYSU/ATEN
Devil in the Details: Towards Accurate Single and Multiple Human Parsing
- keywords: Context Embedding with Edge Perceiving (CE2P)
- arxiv: https://arxiv.org/abs/1809.05996
- github: https://github.com/liutinglt/CE2P
Cross-Domain Complementary Learning with Synthetic Data for Multi-Person Part Segmentation
- intro: University of Washington & Microsoft
- arxiv: https://arxiv.org/abs/1907.05193
Self-Correction for Human Parsing
- arxiv: https://arxiv.org/abs/1910.09777
- github: https://github.com/PeikeLi/Self-Correction-Human-Parsing
Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing
- intro: AAAI 2020
- arxiv: https://arxiv.org/abs/1911.12053
- github: https://github.com/Charleshhy/Grapy-ML
Learning Semantic Neural Tree for Human Parsing
- intro: Institute of Software Chinese Academy of Sciences & State University of New York & JD Finance America Corporation & Tencent Youtu Lab
- arxiv: https://arxiv.org/abs/1912.09622
- code: https://isrc.iscas.ac.cn/gitlab/research/sematree
Self-Learning with Rectification Strategy for Human Parsing
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2004.08055
Correlating Edge, Pose with Parsing
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2005.01431
- github: https://github.com/ziwei-zh/CorrPM
Affinity-aware Compression and Expansion Network for Human Parsing
https://arxiv.org/abs/2008.10191
Renovating Parsing R-CNN for Accurate Multiple Human Parsing
- intro: ECCV 2020
- intro: BUPT & Noah’s Ark Lab, Huawei Technologies
- arxiv: https://arxiv.org/abs/2009.09447
- github: https://github.com/soeaver/RP-R-CNN
Progressive One-shot Human Parsing
- intro: AAAI 2021
- arxiv: https://arxiv.org/abs/2012.11810
- github: https://github.com/Charleshhy/One-shot-Human-Parsing
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
- intro: CVPR 2021 oral
- arxiv: https://arxiv.org/abs/2103.04570
- github: https://github.com/tfzhou/MG-HumanParsing
Quality-Aware Network for Human Parsing
- intro: BUPT & Institute of Automation Chinese Academy of Sciences & Noah’s Ark Lab
- arxiv: https://arxiv.org/abs/2103.05997
- github(PyTorch): https://github.com/soeaver/QANet
End-to-end One-shot Human Parsing
https://arxiv.org/abs/2105.01241
CDGNet: Class Distribution Guided Network for Human Parsing
- intro: Ajou University & Tiangong University & Incheon National University
- arxiv: https://arxiv.org/abs/2111.14173
AIParsing: Anchor-free Instance-level Human Parsing
- intro: IEEE Transactions on Image Processing (TIP)
- arxiv: https://arxiv.org/abs/2207.06854
RepParser: End-to-End Multiple Human Parsing with Representative Parts
- intro: Center for Future Media & University of Electronic Science and Technology of China
- arxiv: https://arxiv.org/abs/2208.12908
Joint Detection and Segmentation
Triply Supervised Decoder Networks for Joint Detection and Segmentation
https://arxiv.org/abs/1809.09299
D2Det: Towards High Quality Object Detection and Instance Segmentation
- intro: CVPR 2020
- paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Cao_D2Det_Towards_High_Quality_Object_Detection_and_Instance_Segmentation_CVPR_2020_paper.pdf
- github: https://github.com/JialeCao001/D2Det
Video Object Segmentation
Fast object segmentation in unconstrained video
- project page: http://calvin.inf.ed.ac.uk/software/fast-video-segmentation/
- paper: http://calvin.inf.ed.ac.uk/wp-content/uploads/Publications/papazoglouICCV2013-camera-ready.pdf
Recurrent Fully Convolutional Networks for Video Segmentation
Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation
Clockwork Convnets for Video Semantic Segmentation
- intro: ECCV 2016 Workshops
- intro: evaluated on the Youtube-Objects, NYUD, and Cityscapes video datasets
- arxiv: http://arxiv.org/abs/1608.03609
- github: https://github.com/shelhamer/clockwork-fcn
STFCN: Spatio-Temporal FCN for Semantic Video Segmentation
One-Shot Video Object Segmentation
- intro: OSVOS
- project: http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos/
- arxiv: https://arxiv.org/abs/1611.05198
- github(official): https://github.com/kmaninis/OSVOS-caffe
- github(official): https://github.com/scaelles/OSVOS-TensorFlow
- github(official): https://github.com/kmaninis/OSVOS-PyTorch
DAVIS: Densely Annotated VIdeo Segmentation
- homepage: http://davischallenge.org/
- arxiv: https://arxiv.org/abs/1704.00675
Video Object Segmentation Without Temporal Information
https://arxiv.org/abs/1709.06031
Convolutional Gated Recurrent Networks for Video Segmentation
Learning Video Object Segmentation from Static Images
Semantic Video Segmentation by Gated Recurrent Flow Propagation
FusionSeg: Learning to combine motion and appearance for fully automatic segmention of generic objects in videos
- project page: http://vision.cs.utexas.edu/projects/fusionseg/
- arxiv: https://arxiv.org/abs/1701.05384
- github: https://github.com/suyogduttjain/fusionseg
Unsupervised learning from video to detect foreground objects in single images
https://arxiv.org/abs/1703.10901
Semantically-Guided Video Object Segmentation
https://arxiv.org/abs/1704.01926
Learning Video Object Segmentation with Visual Memory
https://arxiv.org/abs/1704.05737
Flow-free Video Object Segmentation
https://arxiv.org/abs/1706.09544
Online Adaptation of Convolutional Neural Networks for Video Object Segmentation
https://arxiv.org/abs/1706.09364
Video Object Segmentation using Tracked Object Proposals
- intro: CVPR 2017 workshop, DAVIS 2017 Challenge
- arxiv: https://arxiv.org/abs/1707.06545
Video Object Segmentation with Re-identification
- intro: CVPR 2017 Workshop, DAVIS Challenge on Video Object Segmentation 2017 (Winning Entry)
- arxiv: https://arxiv.org/abs/1708.00197
- github(official, PyTorch): https://github.com/lxx1991/VS-ReID
Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1708.05137
MaskRNN: Instance Level Video Object Segmentation
- intro: NIPS 2017
- arxiv: https://arxiv.org/abs/1803.11187
SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
- project page: https://sites.google.com/site/yihsuantsai/research/iccv17-segflow
- arxiv: https://arxiv.org/abs/1709.06750
- github: https://github.com/JingchunCheng/SegFlow
Video Semantic Object Segmentation by Self-Adaptation of DCNN
https://arxiv.org/abs/1711.08180
Learning to Segment Moving Objects
https://arxiv.org/abs/1712.01127
Instance Embedding Transfer to Unsupervised Video Object Segmentation
- intro: University of Southern California & Google Inc
- arxiv: https://arxiv.org/abs/1801.00908
- blog: https://medium.com/@barvinograd1/instance-embedding-instance-segmentation-without-proposals-31946a7c53e1
Efficient Video Object Segmentation via Network Modulation
- intro: Snap Inc. & Northwestern University & Google Inc.
- arxiv: https://arxiv.org/abs/1802.01218
Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation
- intro: ECCV 2018
- intro: CUHK
- keywords: DyeNet
- arxiv: https://arxiv.org/abs/1803.04242
Video Object Segmentation with Language Referring Expressions
https://arxiv.org/abs/1803.08006
Dynamic Video Segmentation Network
- intro: CVPR 2018
- keywords: DVSNet
- arxiv: https://arxiv.org/abs/1804.00931
- github: https://github.com/XUSean0118/DVSNet
Low-Latency Video Semantic Segmentation
- intro: CVPR 2018 Spotlight
- arxiv: https://arxiv.org/abs/1804.00389
Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1804.03131
Unsupervised Video Object Segmentation for Deep Reinforcement Learning
- intro: University of Waterloo
- arxiv: https://arxiv.org/abs/1805.07780
Fast and Accurate Online Video Object Segmentation via Tracking Parts
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1806.02323
- github: https://github.com/JingchunCheng/FAVOS
ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation
- intro: CVPR Workshop - DAVIS Challenge 2018
- arxiv: https://arxiv.org/abs/1806.05510
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1807.03148
Fast Video Object Segmentation by Reference-Guided Mask Propagation
- intro: CVPR 2018
- paper: http://openaccess.thecvf.com/content_cvpr_2018/CameraReady/1029.pdf
- github: https://github.com/seoungwugoh/RGMP
PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation
https://arxiv.org/abs/1807.09190
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
- intro: ECCV 2018. Adobe Research & Snapchat Research & UIUC
- project page: https://youtube-vos.org/
- arxiv: https://arxiv.org/abs/1809.00461
VideoMatch: Matching based Video Object Segmentation
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1809.01123
Mask Propagation Network for Video Object Segmentation
- intro: ByteDance AI Lab
- arxiv: https://arxiv.org/abs/1810.10289
Tukey-Inspired Video Object Segmentation
https://arxiv.org/abs/1811.07958
A Generative Appearance Model for End-to-end Video Object Segmentation
https://arxiv.org/abs/1811.11611
Unseen Object Segmentation in Videos via Transferable Representations
- intro: ACCV 2018 oral
- arxiv: https://arxiv.org/abs/1901.02444
- github: https://github.com/wenz116/TransferSeg
FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation
- intro: CVPR 2019
- intro: RWTH Aachen University & Google Inc.
- arxiv: https://arxiv.org/abs/1902.09513
RVOS: End-to-End Recurrent Network for Video Object Segmentation
- intro: CVPR 2019
- project page: https://imatge-upc.github.io/rvos/
- arxiv: https://arxiv.org/abs/1903.05612
BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames
- intro: CVPR 2019
- intro: University of Michigan
- arxiv: https://arxiv.org/abs/1903.11779
- github: https://github.com/griffbr/BubbleNets
- video: https://www.youtube.com/watch?v=0kNmm8SBnnU&feature=youtu.be
Fast video object segmentation with Spatio-Temporal GANs
https://arxiv.org/abs/1903.12161
Video Object Segmentation using Space-Time Memory Networks
- intro: ICCV 2019
- intro: Yonsei University & Adobe Research
- arxiv: https://arxiv.org/abs/1904.00607
- github: https://github.com/seoungwugoh/STM
Spatiotemporal CNN for Video Object Segmentation
https://arxiv.org/abs/1904.02363
Architecture Search of Dynamic Cells for Semantic Video Segmentation
https://arxiv.org/abs/1904.02371
BoLTVOS: Box-Level Tracking for Video Object Segmentation
https://arxiv.org/abs/1904.04552
MAIN: Multi-Attention Instance Network for Video Segmentation
https://arxiv.org/abs/1904.05847
MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation
- intro: CVPR 2019 oral
- arxiv: https://arxiv.org/abs/1904.08141
Video Instance Segmentation
- intro: ICCV 2019
- intro: ByteDance AI Lab & UIUC & Adobe Research
- keywords: MaskTrack R-CNN
- arxiv: https://arxiv.org/abs/1905.04804
- github: https://github.com/youtubevos/MaskTrackRCNN
OVSNet : Towards One-Pass Real-Time Video Object Segmentation
- intro: Zhejiang University & SenseTime Research & Tianjin University
- arxiv: https://arxiv.org/abs/1905.10064
Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation
- intro: Huazhong University of Science and Technology & Horizon Robotics
- arxiv: https://arxiv.org/abs/1907.01203
- github: https://github.com/sydney0zq/PTSNet
RANet: Ranking Attention Network for Fast Video Object Segmentation
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1908.06647
- github: https://github.com/Storife/RANet
DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1909.12471
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1910.00132
Towards Good Practices for Video Object Segmentation
- intro: ByteDance AI Lab
- arxiv: https://arxiv.org/abs/1909.13583
Anchor Diffusion for Unsupervised Video Object Segmentation
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1910.10895
Learning a Spatio-Temporal Embedding for Video Instance Segmentation
- intro: University of Cambridge
- arxiv: https://arxiv.org/abs/1912.08969
Efficient Semantic Video Segmentation with Per-frame Inference
- intro: ECCV 2020
- intro: The University of Adelaide & Huazhong University of Science and Technology & Microsoft Research
- arxiv: https://arxiv.org/abs/2002.11433
- github: https://github.com/irfanICMLL/ETC-Real-time-Per-frame-Semantic-video-segmentation
State-Aware Tracker for Real-Time Video Object Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2003.00482
- github: https://github.com/MegviiDetection/video_analyst
Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement
- intro: NeurIPS 2020
- arxiv: https://arxiv.org/abs/2010.07958
SwiftNet: Real-time Video Object Segmentation
https://arxiv.org/abs/2102.04604
SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation
https://arxiv.org/abs/2103.10284
Challenge
DAVIS Challenge on Video Object Segmentation 2017
http://davischallenge.org/challenge2017/publications.html
Matting
Deep Image Matting
- intro: CVPR 2017
- intro: Beckman Institute for Advanced Science and Technology & Adobe Research
- project page: https://sites.google.com/view/deepimagematting
- arxiv: https://arxiv.org/abs/1703.03872
- github(unofficial): https://github.com/open-mmlab/mmediting/tree/master/configs/mattors/dim
- github(unofficial): https://github.com/foamliu/Deep-Image-Matting
- github(unofficial): https://github.com/foamliu/Deep-Image-Matting-PyTorch
- github(unofficial): https://github.com/huochaitiantang/pytorch-deep-image-matting
Fast Deep Matting for Portrait Animation on Mobile Phone
- intro: ACM Multimedia Conference (MM) 2017
- intro: requires no user interaction and achieves real-time matting at 15 fps
- arxiv: https://arxiv.org/abs/1707.08289
Real-time deep hair matting on mobile devices
- intro: ModiFace Inc, University of Toronto
- arxiv: https://arxiv.org/abs/1712.07168
TOM-Net: Learning Transparent Object Matting from a Single Image
- intro: CVPR 2018
- project page: http://gychen.org/TOM-Net/
- arxiv: https://arxiv.org/abs/1803.04636
- github: https://github.com/guanyingc/TOM-Net
Deep Video Portraits
- intro: SIGGRAPH 2018
- arxiv: https://arxiv.org/abs/1805.11714
- youtube: https://www.youtube.com/watch?v=qc5P2bvfl44
Inductive Guided Filter: Real-time Deep Image Matting with Weakly Annotated Masks on Mobile Devices
- intro: Shanghai Jiao Tong University & Versa
- arxiv: https://arxiv.org/abs/1905.06747
Indices Matter: Learning to Index for Deep Image Matting
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1908.00672
- github(official): https://github.com/poppinace/indexnet_matting
- github: https://github.com/open-mmlab/mmediting/tree/master/configs/mattors/indexnet
Disentangled Image Matting
https://arxiv.org/abs/1909.04686
Natural Image Matting via Guided Contextual Attention
- intro: AAAI 2020
- arxiv: https://arxiv.org/abs/2001.04069
- github: https://github.com/Yaoyi-Li/GCA-Matting
F, B, Alpha Matting
- intro: ECCV 2020
- arxiv: https://arxiv.org/abs/2003.07711
- github: https://github.com/MarcoForte/FBA_Matting
Background Matting: The World is Your Green Screen
- intro: CVPR 2020
- intro: University of Washington
- project page: https://grail.cs.washington.edu/projects/background-matting/
- arxiv: https://arxiv.org/abs/2004.00626
- github: https://github.com/senguptaumd/Background-Matting
- blog: https://towardsdatascience.com/background-matting-the-world-is-your-green-screen-83a3c4f0f635
Hierarchical Opacity Propagation for Image Matting
- intro: Shanghai Jiao Tong University
- arxiv: https://arxiv.org/abs/2004.03249
- github: https://github.com/Yaoyi-Li/HOP-Matting
High-Resolution Deep Image Matting
- intro: UIUC & Adobe Research & University of Oregon
- arxiv: https://arxiv.org/abs/2009.06613
Learning Affinity-Aware Upsampling for Deep Image Matting
- intro: The University of Adelaide & Huazhong University of Science and Technology
- arxiv: https://arxiv.org/abs/2011.14288
Real-Time High-Resolution Background Matting
- project page: https://grail.cs.washington.edu/projects/background-matting-v2/
- arxiv: https://arxiv.org/abs/2012.07810
- github: https://github.com/PeterL1n/BackgroundMattingV2
Deep Video Matting via Spatio-Temporal Alignment and Aggregation
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2104.11208
- github: https://github.com/nowsyn/DVM
Trimap-guided Feature Mining and Fusion Network for Natural Image Matting
- intro: Shanghai Jiao Tong University & ByteDance Inc.
- arxiv: https://arxiv.org/abs/2112.00510
Boosting Robustness of Image Matting with Context Assembling and Strong Data Augmentation
- intro: The University of Adelaide & Adobe Inc. & Zhejiang University
- arxiv: https://arxiv.org/abs/2201.06889
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
- intro: Seoul National University & NAVER WEBTOON AI
- arxiv: https://arxiv.org/abs/2203.15662
Referring Image Matting
- intro: The University of Sydney & JD Explore Academy
- arxiv: https://arxiv.org/abs/2206.05149
- github: https://github.com/JizhiziLi/RIM
One-Trimap Video Matting
- intro: ECCV 2022
- arxiv: https://arxiv.org/abs/2207.13353
- github: https://github.com/Hongje/OTVM
TransMatting: Enhancing Transparent Objects Matting with Transformers
- intro: ECCV 2022
- arxiv: https://arxiv.org/abs/2208.03007
- github: https://github.com/AceCHQ/TransMatting
Trimap-free Matting
Semantic Human Matting
- intro: ACM Multimedia 2018
- arxiv: https://arxiv.org/abs/1809.01354
- github(unofficial): https://github.com/lizhengwei1992/Semantic_Human_Matting
Instance Segmentation based Semantic Matting for Compositing Applications
- intro: CRV 2019
- arxiv: https://arxiv.org/abs/1904.05457
A Late Fusion CNN for Digital Matting
- intro: CVPR 2019
- intro: Zhejiang University & Alibaba Group & University of Texas at Austin
- paper: https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_A_Late_Fusion_CNN_for_Digital_Matting_CVPR_2019_paper.pdf
- github(official, Keras): https://github.com/yunkezhang/FusionMatting
Attention-Guided Hierarchical Structure Aggregation for Image Matting
- intro: CVPR 2020
- project page: https://wukaoliu.github.io/HAttMatting/
- paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Qiao_Attention-Guided_Hierarchical_Structure_Aggregation_for_Image_Matting_CVPR_2020_paper.pdf
- github: https://github.com/wukaoliu/CVPR2020-HAttMatting
Boosting Semantic Human Matting with Coarse Annotations
- intro: Alibaba Group & Tsinghua University
- arxiv: https://arxiv.org/abs/2004.04955
End-to-end Animal Image Matting
- keywords: Glance and Focus Matting network (GFM), AM-2k dataset, BG-20k dataset
- arxiv: https://arxiv.org/abs/2010.16188
- github: https://github.com/JizhiziLi/animal-matting/
Is a Green Screen Really Necessary for Real-Time Human Matting?
- intro: City University of Hong Kong & SenseTime Research
- arxiv: https://arxiv.org/abs/2011.11961
- github: https://github.com/ZHKKKe/MODNet
Multi-scale Information Assembly for Image Matting
https://arxiv.org/abs/2101.02391
Salient Image Matting
- intro: Fynd & University of Michigan
- arxiv: https://arxiv.org/abs/2103.12337
Mask Guided Matting via Progressive Refinement Network
- intro: CVPR 2021
- intro: The Johns Hopkins University & Adobe
- arxiv: https://arxiv.org/abs/2012.06722
- github: https://github.com/yucornetto/MGMatting
Privacy-Preserving Portrait Matting
- intro: The University of Sydney & JD Explore Academy
- arxiv: https://arxiv.org/abs/2104.14222
- github: https://github.com/SHI-Labs/Pseudo-IoU-for-Anchor-Free-Object-Detection
Highly Efficient Natural Image Matting
- intro: BMVC 2021
- arxiv: https://arxiv.org/abs/2110.12748
PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset
- intro: WACV 2021 workshop
- intro: Baidu, Inc.
- arxiv: https://arxiv.org/abs/2112.07146
- github: https://github.com/PaddlePaddle/PaddleSeg
Situational Perception Guided Image Matting
- intro: OPPO Research Institute & PicUp.AI & Xmotors
- arxiv: https://arxiv.org/abs/2204.09276
PP-Matting: High-Accuracy Natural Image Matting
- intro: Baidu Inc.
- arxiv: https://arxiv.org/abs/2204.09433
- github: https://github.com/PaddlePaddle/PaddleSeg
VMFormer: End-to-End Video Matting with Transformer
- project page: https://chrisjuniorli.github.io/project/VMFormer/
- intro: University of Oregon & UIUC & BJTU & Picsart AI Research (PAIR)
- arxiv: https://arxiv.org/abs/2208.12801
- github: https://github.com/SHI-Labs/VMFormer
3D Segmentation
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
- intro: Stanford University
- project page: http://stanford.edu/~rqi/pointnet/
- arxiv: https://arxiv.org/abs/1612.00593
- github: https://github.com/charlesq34/pointnet
DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks
- arxiv: https://arxiv.org/abs/1703.03098
SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
- intro: UC Berkeley
- arxiv: https://arxiv.org/abs/1710.07368
SEGCloud: Semantic Segmentation of 3D Point Clouds
- intro: International Conference on 3D Vision (3DV) 2017 (Spotlight). Stanford University
- homepage: http://segcloud.stanford.edu/
- arxiv: https://arxiv.org/abs/1710.07563
3D Instance Segmentation via Multi-task Metric Learning
- intro: KAUST & ETH Zurich
- arxiv: https://arxiv.org/abs/1906.08650
3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation
- intro: RWTH Aachen University & Google & Technical University Munich
- project page: https://www.vision.rwth-aachen.de/publication/00199/
- arxiv: https://arxiv.org/abs/2003.13867
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
- intro: CVPR 2020
- arxiv: https://arxiv.org/abs/2004.01658
Line Parsing
Fully Convolutional Line Parsing
- intro: ICCV 2021
- intro: UESTC & UC Berkeley
- arxiv: https://arxiv.org/abs/2104.11207
- github(PyTorch): https://github.com/Delay-Xili/F-Clip
Projects
TF Image Segmentation: Image Segmentation framework
- intro: Image Segmentation framework based on TensorFlow and the TF-Slim library
- github: https://github.com/warmspringwinds/tf-image-segmentation
KittiSeg: A Kitti Road Segmentation model implemented in tensorflow.
- keywords: MultiNet
- intro: KittiSeg performs segmentation of roads by utilizing an FCN based model.
- github: https://github.com/MarvinTeichmann/KittiSeg
Semantic Segmentation Architectures Implemented in PyTorch
- intro: Segnet/FCN/U-Net/Link-Net
- github: https://github.com/meetshah1995/pytorch-semseg
PyTorch for Semantic Segmentation
https://github.com/ZijunDeng/pytorch-semantic-segmentation
LightNet: Light-weight Networks for Semantic Image Segmentation
- project page: https://ansleliu.github.io/LightNet.html
- github: https://github.com/ansleliu/LightNet
LightNet++: Boosted Light-weighted Networks for Real-time Semantic Segmentation
- project page: https://ansleliu.github.io/LightNet.html
- github: https://github.com/ansleliu/LightNetPlusPlus
Leaderboard
Segmentation Results: VOC2012 BETA: Competition “comp6” (train on own data)
http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?cls=mean&challengeid=11&compid=6
Blogs
Mobile Real-time Video Segmentation
https://research.googleblog.com/2018/03/mobile-real-time-video-segmentation.html
Deep Learning for Natural Image Segmentation Priors
http://cs.brown.edu/courses/csci2951-t/finals/ghope/
Image Segmentation Using DIGITS 5
https://devblogs.nvidia.com/parallelforall/image-segmentation-using-digits-5/
Image Segmentation with Tensorflow using CNNs and Conditional Random Fields
http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/18/image-segmentation-with-tensorflow-using-cnns-and-conditional-random-fields/
Fully Convolutional Networks (FCNs) for Image Segmentation
- blog: http://warmspringwinds.github.io/tensorflow/tf-slim/2017/01/23/fully-convolutional-networks-(fcns)-for-image-segmentation/
- ipn: https://github.com/warmspringwinds/tensorflow_notes/blob/master/fully_convolutional_networks.ipynb
Image segmentation with Neural Net
- blog: https://medium.com/@m.zaradzki/image-segmentation-with-neural-net-d5094d571b1e#.s5f711g1q
- github: https://github.com/mzaradzki/neuralnets/tree/master/vgg_segmentation_keras
A 2017 Guide to Semantic Segmentation with Deep Learning
http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review
Tutorials / Talks
A Unified Architecture for Instance and Semantic Segmentation
- intro: FPN
- slides: http://presentations.cocodataset.org/COCO17-Stuff-FAIR.pdf
Deep learning for image segmentation
- intro: PyData Warsaw - Mateusz Opala & Michał Jamroż
- youtube: https://www.youtube.com/watch?v=W6r_a5crqGI