Segmentation

Published: 09 Oct 2015 Category: deep_learning

Papers

Deep Joint Task Learning for Generic Object Extraction

intro: NIPS 2014
homepage: http://vision.sysu.edu.cn/projects/deep-joint-task-learning/
paper: http://ss.sysu.edu.cn/~ll/files/NIPS2014_JointTask.pdf
github: https://github.com/xiaolonw/nips14_loc_seg_testonly
dataset: http://objectextraction.github.io/

Highly Efficient Forward and Backward Propagation of Convolutional Neural Networks for Pixelwise Classification

arxiv: https://arxiv.org/abs/1412.4526
code(Caffe): https://dl.dropboxusercontent.com/u/6448899/caffe.zip
author page: http://www.ee.cuhk.edu.hk/~hsli/

Segmentation from Natural Language Expressions

intro: ECCV 2016
project page: http://ronghanghu.com/text_objseg/
arxiv: http://arxiv.org/abs/1603.06180
github(TensorFlow): https://github.com/ronghanghu/text_objseg
github(Caffe): https://github.com/Seth-Park/text_objseg_caffe

Semantic Object Parsing with Graph LSTM

arxiv: http://arxiv.org/abs/1603.07063

Fine Hand Segmentation using Convolutional Neural Networks

arxiv: http://arxiv.org/abs/1608.07454

Feedback Neural Network for Weakly Supervised Geo-Semantic Segmentation

intro: Facebook Connectivity Lab & Facebook Core Data Science & University of Illinois
arxiv: https://arxiv.org/abs/1612.02766

FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics

arxiv: https://arxiv.org/abs/1612.05360

A deep learning model integrating FCNNs and CRFs for brain tumor segmentation

arxiv: https://arxiv.org/abs/1702.04528

Texture segmentation with Fully Convolutional Networks

intro: Dublin City University
arxiv: https://arxiv.org/abs/1703.05230

Fast LIDAR-based Road Detection Using Convolutional Neural Networks

https://arxiv.org/abs/1703.03613

Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

arxiv: https://arxiv.org/abs/1703.04363
demo: https://gyglim.github.io/deep-value-net/

Annotating Object Instances with a Polygon-RNN

intro: CVPR 2017. CVPR Best Paper Honorable Mention Award
intro: University of Toronto
keywords: PolygonRNN
project page: http://www.cs.toronto.edu/polyrnn/
arxiv: https://arxiv.org/abs/1704.05548

Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++

intro: CVPR 2018
keywords: PolygonRNN++
project page: http://www.cs.toronto.edu/polyrnn/
arxiv: https://arxiv.org/abs/1803.09693
github: https://github.com/davidjesusacu/polyrnn-pp

Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF

intro: CVPR 2017
paper: http://openaccess.thecvf.com/content_cvpr_2017/papers/Shen_Semantic_Segmentation_via_CVPR_2017_paper.pdf
github(Caffe): https://github.com//FalongShen/SegModel

Distantly Supervised Road Segmentation

intro: ICCV workshop CVRSUAD2017. Indiana University & Preferred Networks
arxiv: https://arxiv.org/abs/1708.06118

Ω-Net: Fully Automatic, Multi-View Cardiac MR Detection, Orientation, and Segmentation with Deep Neural Networks

Ω-Net (Omega-Net): Fully Automatic, Multi-View Cardiac MR Detection, Orientation, and Segmentation with Deep Neural Networks

https://arxiv.org/abs/1711.01094

Superpixel clustering with deep features for unsupervised road segmentation

intro: Preferred Networks, Inc & Indiana University
arxiv: https://arxiv.org/abs/1711.05998

Learning to Segment Human by Watching YouTube

intro: TPAMI 2017
arxiv: https://arxiv.org/abs/1710.01457

W-Net: A Deep Model for Fully Unsupervised Image Segmentation

https://arxiv.org/abs/1711.08506

End-to-end detection-segmentation network with ROI convolution

intro: ISBI 2018
arxiv: https://arxiv.org/abs/1801.02722

A Foreground Inference Network for Video Surveillance Using Multi-View Receptive Field

https://arxiv.org/abs/1801.06593

Piecewise Flat Embedding for Image Segmentation

https://arxiv.org/abs/1802.03248

A Pyramid CNN for Dense-Leaves Segmentation

intro: Computer and Robot Vision, Toronto, May 2018
arxiv: https://arxiv.org/abs/1804.01646

Capsules for Object Segmentation

keywords: convolutional-deconvolutional capsule network, SegCaps, U-Net
arxiv: https://arxiv.org/abs/1804.04241

Deep Object Co-Segmentation

https://arxiv.org/abs/1804.06423

Semantic Aware Attention Based Deep Object Co-segmentation

https://arxiv.org/abs/1810.06859

Contextual Hourglass Networks for Segmentation and Density Estimation

https://arxiv.org/abs/1806.04009

U-Net

U-Net: Convolutional Networks for Biomedical Image Segmentation

intro: conditionally accepted at MICCAI 2015
project page: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
arxiv: http://arxiv.org/abs/1505.04597
code+data: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/u-net-release-2015-10-02.tar.gz
github: https://github.com/orobix/retina-unet
github: https://github.com/jakeret/tf_unet
notes: http://zongwei.leanote.com/post/Pa

UNet++: A Nested U-Net Architecture for Medical Image Segmentation

intro: 4th Deep Learning in Medical Image Analysis (DLMIA) Workshop
arxiv: https://arxiv.org/abs/1807.10165

UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation

intro: ICASSP 2020
arxiv: https://arxiv.org/abs/2004.08790
github: https://github.com/ZJUGiveLab/UNet-Version

DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation

https://arxiv.org/abs/1709.00201

TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

intro: Lyft Inc. & MIT
intro: part of the winning solution (1st out of 735) in the Kaggle: Carvana Image Masking Challenge
arxiv: https://arxiv.org/abs/1801.05746
github: https://github.com/ternaus/TernausNet

A Probabilistic U-Net for Segmentation of Ambiguous Images

intro: DeepMind & German Cancer Research Center
arxiv: https://arxiv.org/abs/1806.05034

Deep Dual Pyramid Network for Barcode Segmentation using Barcode-30k Database

https://arxiv.org/abs/1807.11886

Deep Smoke Segmentation

https://arxiv.org/abs/1809.00774

Smoothed Dilated Convolutions for Improved Dense Prediction

intro: KDD 2018
arxiv: https://arxiv.org/abs/1808.08931
github: https://github.com/divelab/dilated

DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation

https://arxiv.org/abs/1809.06013

Improving Fast Segmentation With Teacher-student Learning

https://arxiv.org/abs/1810.08476

DSNet: An Efficient CNN for Road Scene Segmentation

https://arxiv.org/abs/1904.05022

Line Segment Detection Using Transformers without Edges

https://arxiv.org/abs/2101.01909

Holistic Segmentation

intro: Technical University of Munich & BMW Group & Johns Hopkins University & Google
arxiv: https://arxiv.org/abs/2209.05407

Unified Image Segmentation

K-Net: Towards Unified Image Segmentation

intro: NeurIPS 2021
intro: Nanyang Technological University & Chinese University of Hong Kong & SenseTime Research & Shanghai AI Laboratory
project page: https://www.mmlab-ntu.com/project/knet/index.html
arxiv: https://arxiv.org/abs/2106.14855
github: https://github.com/ZwwWayne/K-Net/

Masked-attention Mask Transformer for Universal Image Segmentation

project page: https://bowenc0221.github.io/mask2former/
arxiv: https://arxiv.org/abs/2112.01527
github: https://github.com/facebookresearch/Mask2Former

Mask2Former for Video Instance Segmentation

intro: University of Illinois at Urbana-Champaign (UIUC) & Facebook AI Research (FAIR)
arxiv: https://arxiv.org/abs/2112.10764
github: https://github.com/facebookresearch/Mask2Former

Foreground Object Segmentation

Pixel Objectness

project page: http://vision.cs.utexas.edu/projects/pixelobjectness/
arxiv: https://arxiv.org/abs/1701.05349
github: https://github.com/suyogduttjain/pixelobjectness

A Deep Convolutional Neural Network for Background Subtraction

arxiv: https://arxiv.org/abs/1702.01731

Learning Multi-scale Features for Foreground Segmentation

arxiv: https://arxiv.org/abs/1808.01477
github: https://github.com/lim-anggun/FgSegNet_v2

Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview

https://arxiv.org/abs/1810.04377

Semantic Segmentation

Fully Convolutional Networks for Semantic Segmentation

intro: CVPR 2015, PAMI 2016
keywords: deconvolutional layer, crop layer
arxiv: http://arxiv.org/abs/1411.4038
arxiv(PAMI 2016): http://arxiv.org/abs/1605.06211
slides: https://docs.google.com/presentation/d/1VeWFMpZ8XN7OC3URZP4WdXvOGYckoFWGVN7hApoXVnc
slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-pixels.pdf
talk: http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/
github(official): https://github.com/shelhamer/fcn.berkeleyvision.org
github: https://github.com/BVLC/caffe/wiki/Model-Zoo#fcn
github: https://github.com/MarvinTeichmann/tensorflow-fcn
github(Chainer): https://github.com/wkentaro/fcn
github: https://github.com/wkentaro/pytorch-fcn
github: https://github.com/shekkizh/FCN.tensorflow
notes: http://zhangliliang.com/2014/11/28/paper-note-fcn-segment/

From Image-level to Pixel-level Labeling with Convolutional Networks

intro: CVPR 2015
intro: “Weakly Supervised Semantic Segmentation with Convolutional Networks”
intro: performs semantic segmentation based only on image-level annotations in a multiple instance learning framework
arxiv: http://arxiv.org/abs/1411.6228
paper: http://ronan.collobert.com/pub/matos/2015_semisupsemseg_cvpr.pdf

Feedforward semantic segmentation with zoom-out features

intro: CVPR 2015. Toyota Technological Institute at Chicago
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Mostajabi_Feedforward_Semantic_Segmentation_2015_CVPR_paper.pdf
bitbucket: https://bitbucket.org/m_mostajabi/zoom-out-release
video: https://www.youtube.com/watch?v=HvgvX1LXQa8

DeepLab

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

intro: ICLR 2015. DeepLab
arxiv: http://arxiv.org/abs/1412.7062
bitbucket: https://bitbucket.org/deeplab/deeplab-public/
github: https://github.com/TheLegendAli/DeepLab-Context

Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation

intro: DeepLab
arxiv: http://arxiv.org/abs/1502.02734
bitbucket: https://bitbucket.org/deeplab/deeplab-public/
github: https://github.com/TheLegendAli/DeepLab-Context

DeepLab v2

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

intro: TPAMI
intro: 79.7% mIoU on the test set of the PASCAL VOC 2012 semantic image segmentation task
intro: updated version of the authors' earlier ICLR 2015 paper
project page: http://liangchiehchen.com/projects/DeepLab.html
arxiv: https://arxiv.org/abs/1606.00915
bitbucket: https://bitbucket.org/aquariusjay/deeplab-public-ver2
github: https://github.com/DrSleep/tensorflow-deeplab-resnet
github: https://github.com/isht7/pytorch-deeplab-resnet
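
The mIoU figure quoted for this entry is the mean, over classes, of intersection-over-union between predicted and ground-truth label maps. The following is a minimal sketch of that metric, not the official PASCAL VOC evaluation code; the function name `mean_iou`, the flat-list input format, and the ignore label 255 are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class ids, one per pixel.
    Pixels whose ground-truth id equals ignore_label (an assumed convention)
    are skipped entirely.
    """
    inter = [0] * num_classes  # pixels where prediction and ground truth agree
    union = [0] * num_classes  # pixels belonging to the class in either map
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Classes absent from both maps have an empty union and are left out.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical six-pixel example; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

Benchmark servers usually accumulate the per-class counts over the whole test set before dividing, rather than averaging per-image scores, so per-image use of this sketch will not reproduce leaderboard numbers.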

DeepLabv2 (ResNet-101)

http://liangchiehchen.com/projects/DeepLabv2_resnet.html

DeepLab v3

Rethinking Atrous Convolution for Semantic Image Segmentation

intro: Google. DeepLabv3
arxiv: https://arxiv.org/abs/1706.05587

DeepLabv3+

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

intro: Google Inc.
arxiv: https://arxiv.org/abs/1802.02611
github: https://github.com/tensorflow/models/tree/master/research/deeplab
blog: https://research.googleblog.com/2018/03/semantic-image-segmentation-with.html
github: https://github.com/hualin95/Deeplab-v3plus

DeeperLab

DeeperLab: Single-Shot Image Parser

intro: MIT & Google Inc. & UC Berkeley
arxiv: https://arxiv.org/abs/1902.05093

Auto-DeepLab

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

intro: CVPR 2019 oral
intro: Johns Hopkins University & Google & Stanford University
arxiv: https://arxiv.org/abs/1901.02985
github: https://github.com/tensorflow/models/tree/master/research/deeplab

Conditional Random Fields as Recurrent Neural Networks

intro: ICCV 2015
intro: Oxford / Stanford / Baidu
keywords: CRF-RNN
project page: http://www.robots.ox.ac.uk/~szheng/CRFasRNN.html
arxiv: http://arxiv.org/abs/1502.03240
github: https://github.com/torrvision/crfasrnn
demo: http://www.robots.ox.ac.uk/~szheng/crfasrnndemo
github: https://github.com/martinkersner/train-CRF-RNN

BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation

arxiv: http://arxiv.org/abs/1503.01640

Efficient piecewise training of deep structured models for semantic segmentation

intro: CVPR 2016
arxiv: http://arxiv.org/abs/1504.01013

Learning Deconvolution Network for Semantic Segmentation

intro: ICCV 2015
intro: two-stage training: train the network with easy examples first and fine-tune the trained network with more challenging examples later
keywords: DeconvNet
project page: http://cvlab.postech.ac.kr/research/deconvnet/
arxiv: http://arxiv.org/abs/1505.04366
slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w06-deconvnet.pdf
gitxiv: http://gitxiv.com/posts/9tpJKNTYksN5eWcHz/learning-deconvolution-network-for-semantic-segmentation
github: https://github.com/HyeonwooNoh/DeconvNet
github: https://github.com/HyeonwooNoh/caffe

SegNet

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling

arxiv: http://arxiv.org/abs/1505.07293
github: https://github.com/alexgkendall/caffe-segnet
github: https://github.com/pfnet-research/chainer-segnet

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

homepage: http://mi.eng.cam.ac.uk/projects/segnet/
arxiv: http://arxiv.org/abs/1511.00561
github: https://github.com/alexgkendall/caffe-segnet
tutorial: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

SegNet: Pixel-Wise Semantic Labelling Using a Deep Networks

youtube: https://www.youtube.com/watch?v=xfNYAly1iXo
mirror: http://pan.baidu.com/s/1gdUzDlD

Getting Started with SegNet

ParseNet: Looking Wider to See Better

intro: ICLR 2016
arxiv: http://arxiv.org/abs/1506.04579
github: https://github.com/weiliu89/caffe/tree/fcn
caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#parsenet-looking-wider-to-see-better

Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation

intro: ICLR 2016
keywords: DecoupledNet
project(paper+code): http://cvlab.postech.ac.kr/research/decouplednet/
arxiv: http://arxiv.org/abs/1506.04924
github: https://github.com/HyeonwooNoh/DecoupledNet

Semantic Image Segmentation via Deep Parsing Network

intro: ICCV 2015. CUHK
keywords: Deep Parsing Network (DPN), Markov Random Field (MRF)
homepage: http://personal.ie.cuhk.edu.hk/~lz013/projects/DPN.html
arxiv: http://arxiv.org/abs/1509.02634
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Liu_Semantic_Image_Segmentation_ICCV_2015_paper.pdf
slides: http://personal.ie.cuhk.edu.hk/~pluo/pdf/presentation_dpn.pdf

Multi-Scale Context Aggregation by Dilated Convolutions

intro: ICLR 2016.
intro: Dilated Convolution for Semantic Image Segmentation
homepage: http://vladlen.info/publications/multi-scale-context-aggregation-by-dilated-convolutions/
arxiv: http://arxiv.org/abs/1511.07122
github: https://github.com/fyu/dilation
github: https://github.com/nicolov/segmentation_keras
notes: http://www.inference.vc/dilated-convolutions-and-kronecker-factorisation/

Instance-aware Semantic Segmentation via Multi-task Network Cascades

intro: CVPR 2016 oral. 1st-place winner of MS COCO 2015 segmentation competition
keywords: RoI warping layer, Multi-task Network Cascades (MNC)
arxiv: http://arxiv.org/abs/1512.04412
github: https://github.com/daijifeng001/MNC

Object Segmentation on SpaceNet via Multi-task Network Cascades (MNC)

Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

intro: TransferNet
project page: http://cvlab.postech.ac.kr/research/transfernet/
arxiv: http://arxiv.org/abs/1512.07928
github: https://github.com/maga33/TransferNet

Combining the Best of Convolutional Layers and Recurrent Layers: A Hybrid Network for Semantic Segmentation

arxiv: http://arxiv.org/abs/1603.04871

Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation

ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation

project page: http://research.microsoft.com/en-us/um/people/jifdai/downloads/scribble_sup/
arxiv: http://arxiv.org/abs/1604.05144

Laplacian Reconstruction and Refinement for Semantic Segmentation

Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1605.02264
paper: https://www.ics.uci.edu/~fowlkes/papers/gf-eccv16.pdf
github(MatConvNet): https://github.com/golnazghiasi/LRR

Natural Scene Image Segmentation Based on Multi-Layer Feature Extraction

arxiv: http://arxiv.org/abs/1605.07586

Convolutional Random Walk Networks for Semantic Image Segmentation

arxiv: http://arxiv.org/abs/1605.07681

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

arxiv: http://arxiv.org/abs/1606.02147
github: https://github.com/e-lab/ENet-training
github(Caffe): https://github.com/TimoSaemann/ENet
github: https://github.com/PavlosMelissinos/enet-keras
github: https://github.com/kwotsin/TensorFlow-ENet
blog: http://culurciello.github.io/tech/2016/06/20/training-enet.html

Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery

arxiv: http://arxiv.org/abs/1606.02585

Deep Learning Markov Random Field for Semantic Segmentation

arxiv: http://arxiv.org/abs/1606.07230

Region-based semantic segmentation with end-to-end training

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1607.07671
github: https://github.com/nightrome/matconvnet-calvin

Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1609.00446

PixelNet: Towards a General Pixel-level Architecture

intro: semantic segmentation, edge detection
arxiv: http://arxiv.org/abs/1609.06694

Exploiting Depth from Single Monocular Images for Object Detection and Semantic Segmentation

intro: IEEE T. Image Processing
intro: proposes an RGB-D semantic segmentation method that applies a multi-task training scheme: semantic label prediction and depth value regression
arxiv: https://arxiv.org/abs/1610.01706
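
A multi-task scheme of this kind is commonly trained with one combined objective: a per-pixel classification loss plus a weighted depth regression loss. The sketch below illustrates that general idea only and is not the paper's actual loss; the function names, the squared-error depth term, and the `depth_weight` value are all assumptions.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one pixel's class logits against its integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def joint_loss(pixel_logits, labels, pred_depth, true_depth, depth_weight=0.5):
    """Segmentation loss plus weighted depth loss, both averaged over pixels.

    depth_weight is a hypothetical balancing coefficient between the tasks.
    """
    n = len(labels)
    seg = sum(softmax_cross_entropy(z, y) for z, y in zip(pixel_logits, labels)) / n
    depth = sum((p - d) ** 2 for p, d in zip(pred_depth, true_depth)) / n
    return seg + depth_weight * depth


# Hypothetical single-pixel example with two classes.
loss = joint_loss([[0.0, 0.0]], [0], [1.0], [3.0], depth_weight=0.5)
```

With equal logits the segmentation term is log 2, and the depth term is 0.5 * (1 - 3)^2 = 2, so the example evaluates to log 2 + 2.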

PixelNet: Representation of the pixels, by the pixels, and for the pixels

intro: CMU & Adobe Research
project page: http://www.cs.cmu.edu/~aayushb/pixelNet/
arxiv: https://arxiv.org/abs/1702.06506
github(Caffe): https://github.com/aayushbansal/PixelNet

Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks

arxiv: http://arxiv.org/abs/1609.06846

Deep Structured Features for Semantic Segmentation

arxiv: http://arxiv.org/abs/1609.07916

CNN-aware Binary Map for General Semantic Segmentation

intro: ICIP 2016 Best Paper / Student Paper Finalist
arxiv: https://arxiv.org/abs/1609.09220

Efficient Convolutional Neural Network with Binary Quantization Layer

arxiv: https://arxiv.org/abs/1611.06764

Mixed context networks for semantic segmentation

intro: Hikvision Research Institute
arxiv: https://arxiv.org/abs/1610.05854

High-Resolution Semantic Labeling with Convolutional Neural Networks

arxiv: https://arxiv.org/abs/1611.01962

Gated Feedback Refinement Network for Dense Image Labeling

intro: CVPR 2017
paper: http://www.cs.umanitoba.ca/~ywang/papers/cvpr17.pdf

RefineNet: Multi-Path Refinement Networks with Identity Mappings for High-Resolution Semantic Segmentation

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

intro: CVPR 2017. IoU 83.4% on PASCAL VOC 2012
arxiv: https://arxiv.org/abs/1611.06612
github: https://github.com/guosheng/refinenet
leaderboard: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6#KEY_Multipath-RefineNet-Res152

Light-Weight RefineNet for Real-Time Semantic Segmentation

intro: BMVC 2018
arxiv: https://arxiv.org/abs/1810.03272
github: https://github.com/drsleep/light-weight-refinenet

Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

keywords: Full-Resolution Residual Units (FRRU), Full-Resolution Residual Networks (FRRNs)
arxiv: https://arxiv.org/abs/1611.08323
github(Theano/Lasagne): https://github.com/TobyPDE/FRRN
youtube: https://www.youtube.com/watch?v=PNzQ4PNZSzc

Semantic Segmentation using Adversarial Networks

intro: Facebook AI Research & INRIA. NIPS Workshop on Adversarial Training, Dec 2016, Barcelona, Spain
arxiv: https://arxiv.org/abs/1611.08408
github(Chainer): https://github.com/oyam/Semantic-Segmentation-using-Adversarial-Networks

Improving Fully Convolution Network for Semantic Segmentation

arxiv: https://arxiv.org/abs/1611.08986

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation

intro: Montreal Institute for Learning Algorithms & Ecole Polytechnique de Montreal
arxiv: https://arxiv.org/abs/1611.09326
github: https://github.com/SimJeg/FC-DenseNet
github: https://github.com/titu1994/Fully-Connected-DenseNets-Semantic-Segmentation
github(Keras): https://github.com/0bserver07/One-Hundred-Layers-Tiramisu

Training Bit Fully Convolutional Network for Fast Semantic Segmentation

intro: Megvii
arxiv: https://arxiv.org/abs/1612.00212

Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

intro: “an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries.”
arxiv: https://arxiv.org/abs/1612.01337

Diverse Sampling for Self-Supervised Learning of Semantic Segmentation

arxiv: https://arxiv.org/abs/1612.01991

Mining Pixels: Weakly Supervised Semantic Segmentation Using Image Labels

intro: Nankai University & University of Oxford & NUS
arxiv: https://arxiv.org/abs/1612.02101

FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation

arxiv: https://arxiv.org/abs/1612.02649

Understanding Convolution for Semantic Segmentation

intro: UCSD & CMU & UIUC & TuSimple
arxiv: https://arxiv.org/abs/1702.08502
github(MXNet): https://github.com/TuSimple/TuSimple-DUC
pretrained-models: https://drive.google.com/drive/folders/0B72xLTlRb0SoREhISlhibFZTRmM

Label Refinement Network for Coarse-to-Fine Semantic Segmentation

https://www.arxiv.org/abs/1703.00551

Predicting Deeper into the Future of Semantic Segmentation

intro: Facebook AI Research
arxiv: https://arxiv.org/abs/1703.07684

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

intro: CVPR 2017 (oral)
keywords: Adversarial Erasing (AE)
arxiv: https://arxiv.org/abs/1703.08448

Guided Perturbations: Self Corrective Behavior in Convolutional Neural Networks

intro: University of Maryland & GE Global Research Center
arxiv: https://arxiv.org/abs/1703.07928

Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade

intro: CVPR 2017 spotlight paper
arxiv: https://arxiv.org/abs/1704.01344

Large Kernel Matters – Improve Semantic Segmentation by Global Convolutional Network

https://arxiv.org/abs/1703.02719

Loss Max-Pooling for Semantic Image Segmentation

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.02966

Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation

https://arxiv.org/abs/1704.03593

A Review on Deep Learning Techniques Applied to Semantic Segmentation

https://arxiv.org/abs/1704.06857

Joint Semantic and Motion Segmentation for dynamic scenes using Deep Convolutional Networks

intro: International Institute of Information Technology & Max Planck Institute For Intelligent Systems
arxiv: https://arxiv.org/abs/1704.08331

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

intro: CUHK & SenseTime
project page: https://hszhao.github.io/projects/icnet/
arxiv: https://arxiv.org/abs/1704.08545
github: https://github.com/hszhao/ICNet
video: https://www.youtube.com/watch?v=qWl9idsCuLQ

Feature Forwarding: Exploiting Encoder Representations for Efficient Semantic Segmentation

LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation

project page: https://codeac29.github.io/projects/linknet/
arxiv: https://arxiv.org/abs/1707.03718
github: https://github.com/e-lab/LinkNet

Pixel Deconvolutional Networks

intro: Washington State University
arxiv: https://arxiv.org/abs/1705.06820

Incorporating Network Built-in Priors in Weakly-supervised Semantic Segmentation

intro: IEEE TPAMI
arxiv: https://arxiv.org/abs/1706.02189

Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges

intro: IEEE ITSC 2017
arxiv: https://arxiv.org/abs/1707.02432

Semantic Segmentation with Reverse Attention

intro: BMVC 2017 oral. University of Southern California
arxiv: https://arxiv.org/abs/1707.06426

Stacked Deconvolutional Network for Semantic Segmentation

https://arxiv.org/abs/1708.04943

Learning Dilation Factors for Semantic Segmentation of Street Scenes

intro: GCPR 2017
arxiv: https://arxiv.org/abs/1709.01956

A Self-aware Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation

https://arxiv.org/abs/1709.02764

One-Shot Learning for Semantic Segmentation

intro: BMVC 2017
arxiv: https://arxiv.org/abs/1709.03410
github: https://github.com/lzzcd001/OSLSM

An Adaptive Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation

https://arxiv.org/abs/1709.02764

Semantic Segmentation from Limited Training Data

https://arxiv.org/abs/1709.07665

Unsupervised Domain Adaptation for Semantic Segmentation with GANs

https://arxiv.org/abs/1711.06969

Neuron-level Selective Context Aggregation for Scene Segmentation

https://arxiv.org/abs/1711.08278

Road Extraction by Deep Residual U-Net

https://arxiv.org/abs/1711.10684

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

intro: AAAI 2018
project page: http://mmlab.ie.cuhk.edu.hk/projects/M&M/
arxiv: https://arxiv.org/abs/1712.00661
github: https://github.com/XiaohangZhan/mix-and-match/
github: https://github.com//liuziwei7/mix-and-match

Error Correction for Dense Semantic Image Labeling

https://arxiv.org/abs/1712.03812

Semantic Segmentation via Highly Fused Convolutional Network with Multiple Soft Cost Functions

https://arxiv.org/abs/1801.01317

RTSeg: Real-time Semantic Segmentation Comparative Study

arxiv: https://arxiv.org/abs/1803.02758
github: https://github.com/MSiam/TFSegmentation

ShuffleSeg: Real-time Semantic Segmentation Network

intro: Cairo University
arxiv: https://arxiv.org/abs/1803.03816

Dynamic-structured Semantic Propagation Network

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1803.06067

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

project page: https://sacmehta.github.io/ESPNet/
arxiv: https://arxiv.org/abs/1803.06815
github: https://github.com/sacmehta/ESPNet

Context Encoding for Semantic Segmentation

intro: CVPR 2018
keywords: Synchronized Cross-GPU Batch Normalization
arxiv: https://arxiv.org/abs/1803.08904
github: https://github.com/zhanghang1989/PyTorch-Encoding

Adaptive Affinity Field for Semantic Segmentation

intro: UC Berkeley / ICSI
arxiv: https://arxiv.org/abs/1803.10335

Predicting Future Instance Segmentations by Forecasting Convolutional Features

intro: Facebook AI Research & Univ. Grenoble Alpes
arxiv: https://arxiv.org/abs/1803.11496

Fully Convolutional Adaptation Networks for Semantic Segmentation

intro: CVPR 2018, Rank 1 in Segmentation Track of Visual Domain Adaptation Challenge 2017
keywords: Fully Convolutional Adaptation Networks (FCAN), Appearance Adaptation Networks (AAN) and Representation Adaptation Networks (RAN)
arxiv: https://arxiv.org/abs/1804.08286

Learning a Discriminative Feature Network for Semantic Segmentation

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1804.09337

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

https://arxiv.org/abs/1805.04141

Convolutional CRFs for Semantic Segmentation

arxiv: https://arxiv.org/abs/1805.04777
github: https://github.com/MarvinTeichmann/ConvCRF

ContextNet: Exploring Context and Detail for Semantic Segmentation in Real-time

intro: Toshiba Research
arxiv: https://arxiv.org/abs/1805.04554

DifNet: Semantic Segmentation by Diffusion Networks

https://arxiv.org/abs/1805.08015

Pyramid Attention Network for Semantic Segmentation

https://arxiv.org/abs/1805.10180

Semantic Segmentation with Scarce Data

intro: ICML 2018 Workshop
arxiv: https://arxiv.org/abs/1807.00911

Attention to Refine through Multi-Scales for Semantic Segmentation

https://arxiv.org/abs/1807.02917

Guided Upsampling Network for Real-Time Semantic Segmentation

intro: BMVC 2018
arxiv: https://arxiv.org/abs/1807.07466

Deep Learning for Semantic Segmentation on Minimal Hardware

intro: RoboCup International Symposium 2018. University of Hertfordshire
arxiv: https://arxiv.org/abs/1807.05597

Future Semantic Segmentation with Convolutional LSTM

intro: BMVC 2018
arxiv: https://arxiv.org/abs/1807.07946

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

intro: ECCV 2018
arxiv: https://arxiv.org/abs/1808.00897

Dual Attention Network for Scene Segmentation

https://arxiv.org/abs/1809.02983

Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations

https://arxiv.org/abs/1809.04766

Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation

https://arxiv.org/abs/1809.06323

Semantic Image Segmentation by Scale-Adaptive Networks

github(Caffe): https://github.com/speedinghzl/Scale-Adaptive-Network

Recurrent Iterative Gating Networks for Semantic Segmentation

intro: WACV 2019
arxiv: https://arxiv.org/abs/1811.08043

CGNet: A Light-weight Context Guided Network for Semantic Segmentation

arxiv: https://arxiv.org/abs/1811.08201
github: https://github.com/wutianyiRosun/CGNet

CCNet: Criss-Cross Attention for Semantic Segmentation

intro: Huazhong University of Science and Technology & Horizon Robotics & University of Illinois at Urbana-Champaign
arxiv: https://arxiv.org/abs/1811.11721
github: https://github.com/speedinghzl/CCNet

ShelfNet for Real-time Semantic Segmentation

intro: Yale University
arxiv: https://arxiv.org/abs/1811.11254
github: https://github.com/juntang-zhuang/ShelfNet

Improving Semantic Segmentation via Video Propagation and Label Relaxation

intro: CVPR 2019 oral
arxiv: https://arxiv.org/abs/1812.01593
github: https://github.com/NVIDIA/semantic-segmentation

RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free

arxiv: https://arxiv.org/abs/1901.03353
github: https://github.com/chengyangfu/retinamask

Fast-SCNN: Fast Semantic Segmentation Network

https://arxiv.org/abs/1902.04502

Structured Knowledge Distillation for Semantic Segmentation

intro: CVPR 2019
arxiv: https://arxiv.org/abs/1903.04197

In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images

intro: CVPR 2019
intro: University of Zagreb
keywords: SwiftNet
arxiv: https://arxiv.org/abs/1903.08469
github: https://github.com/orsic/swiftnet

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation

intro: Chinese Academy of Sciences & Deepwise AI Lab
keywords: Joint Pyramid Upsampling (JPU)
project page: http://wuhuikai.me/FastFCNProject/
arxiv: https://arxiv.org/abs/1903.11816
github: https://github.com/wuhuikai/FastFCN

Significance-aware Information Bottleneck for Domain Adaptive Semantic Segmentation

intro: HUST & UTS
arxiv: https://arxiv.org/abs/1904.00876

GFF: Gated Fully Fusion for Semantic Segmentation

https://arxiv.org/abs/1904.01803

DADA: Depth-aware Domain Adaptation in Semantic Segmentation

https://arxiv.org/abs/1904.01886

DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation

intro: Megvii Technology
arxiv: https://arxiv.org/abs/1904.02216

ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation

arxiv: https://arxiv.org/abs/1906.09826
github(official): https://github.com/xiaoyufenfei/ESNet

Gated-SCNN: Gated Shape CNNs for Semantic Segmentation

intro: NVIDIA & University of Waterloo & University of Toronto & Vector Institute
project page: https://nv-tlabs.github.io/GSCNN/
arxiv: https://arxiv.org/abs/1907.05740

DABNet: Depth-wise Asymmetric Bottleneck for Real-time Semantic Segmentation

intro: BMVC 2019
arxiv: https://arxiv.org/abs/1907.11830

Dynamic Graph Message Passing Networks

intro: CVPR 2020 oral
arxiv: https://arxiv.org/abs/1908.06955

Squeeze-and-Attention Networks for Semantic Segmentation

https://arxiv.org/abs/1909.03402

Global Aggregation then Local Distribution in Fully Convolutional Networks

intro: BMVC 2019
arxiv: https://arxiv.org/abs/1909.07229
github: https://github.com/lxtGH/GALD-Net

Graph-guided Architecture Search for Real-time Semantic Segmentation

https://arxiv.org/abs/1909.06793

Feature Pyramid Encoding Network for Real-time Semantic Segmentation

intro: BMVC 2019
arxiv: https://arxiv.org/abs/1909.08599

ACFNet: Attentional Class Feature Network for Semantic Segmentation

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1909.09408

Region Mutual Information Loss for Semantic Segmentation

intro: NeurIPS 2019
arxiv: https://arxiv.org/abs/1910.12037
github: https://github.com/ZJULearning/RMI

Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

intro: NeurIPS 2019
arxiv: https://arxiv.org/abs/1910.13049
github: https://github.com/RogerZhangzz/CAG_UDA

Efficacy of Pixel-Level OOD Detection for Semantic Segmentation

https://arxiv.org/abs/1911.02897

Location-aware Upsampling for Semantic Segmentation

keywords: LaU
arxiv: https://arxiv.org/abs/1911.05250
github: https://github.com/HolmesShuan/Location-aware-Upsampling-for-Semantic-Segmentation

FasterSeg: Searching for Faster Real-time Semantic Segmentation

intro: ICLR 2020
intro: Texas A&M University & Horizon Robotics Inc.
arxiv: https://arxiv.org/abs/1912.10917

AlignSeg: Feature-Aligned Segmentation Networks

https://arxiv.org/abs/2003.00872

Deep Grouping Model for Unified Perceptual Parsing

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2003.11647

Spatial Pyramid Based Graph Reasoning for Semantic Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2003.10211

Learning Dynamic Routing for Semantic Segmentation

intro: CVPR 2020 oral
arxiv: https://arxiv.org/abs/2003.10401
github(official): https://github.com/yanwei-li/DynamicRouting

Learning to Predict Context-adaptive Convolution for Semantic Segmentation

https://arxiv.org/abs/2004.08222

Transferring and Regularizing Prediction for Semantic Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2006.06570

Tensor Low-Rank Reconstruction for Semantic Segmentation

intro: ECCV 2020
intro: Top-1 performance on PASCAL-VOC12
arxiv: https://arxiv.org/abs/2008.00490
github: https://github.com/CWanli/RecoNet

Representative Graph Neural Network

intro: ECCV 2020
arxiv: https://arxiv.org/abs/2008.05202

EfficientFCN: Holistically-guided Decoding for Semantic Segmentation

https://arxiv.org/abs/2008.10487

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

intro: ECCV 2020
arxiv: https://arxiv.org/abs/2007.10035
github: https://github.com/lxtGH/DecoupleSegNets

Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

https://arxiv.org/abs/2010.07930

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

arxiv: https://arxiv.org/abs/2010.09713
github: https://github.com/googleinterns/wss

Importance-Aware Semantic Segmentation in Self-Driving with Discrete Wasserstein Training

intro: AAAI 2020
arxiv: https://arxiv.org/abs/2010.12440

Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation

intro: NeurIPS 2020 oral
arxiv: https://arxiv.org/abs/2011.00147
github: https://github.com/kgl-prml/Pixel-Level-Cycle-Association

CABiNet: Efficient Context Aggregation Network for Low-Latency Semantic Segmentation

intro: University of Twente
arxiv: https://arxiv.org/abs/2011.00993

SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation

https://arxiv.org/abs/2011.12025

Channel-wise Distillation for Semantic Segmentation

arxiv: https://arxiv.org/abs/2011.13256
github: https://github.com/drilistbox

BoxInst: High-Performance Instance Segmentation with Box Annotations

intro: University of Adelaide
arxiv: https://arxiv.org/abs/2012.02310
github: https://github.com/aim-uofa/AdelaiDet/

Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU

arxiv: https://arxiv.org/abs/2012.07489
github: https://github.com/shipra25jain/ESSNet

Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation

intro: AAAI 2021
arxiv: https://arxiv.org/abs/2012.08226

HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation

intro: Facebook AI & Tel Aviv University
arxiv: https://arxiv.org/abs/2012.11582

SETR

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

intro: CVPR 2021
intro: Fudan University & University of Oxford & University of Surrey & Tencent Youtu Lab & Facebook AI
project page: https://fudan-zvg.github.io/SETR/
arxiv: https://arxiv.org/abs/2012.15840
github: https://github.com/fudan-zvg/SETR

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

intro: ICCV 2021 oral
intro: Computer Vision Lab, ETH Zurich & SenseTime Research
arxiv: https://arxiv.org/abs/2101.11939
github: https://github.com/tfzhou/ContrastiveSeg

Active Boundary Loss for Semantic Segmentation

https://arxiv.org/abs/2102.02696

Learning Statistical Texture for Semantic Segmentation

intro: CVPR 2021
intro: Beihang University & SenseTime Research
arxiv: https://arxiv.org/abs/2103.04133

Cross-Dataset Collaborative Learning for Semantic Segmentation

intro: CVPR 2021
intro: Xilinx Inc. & Chinese Academy of Sciences
arxiv: https://arxiv.org/abs/2103.11351

Vision Transformers for Dense Prediction

intro: Intel Labs
arxiv: https://arxiv.org/abs/2103.13413
github: https://github.com/intel-isl/DPT

InverseForm: A Loss Function for Structured Boundary-Aware Segmentation

intro: CVPR 2021 oral
intro: Qualcomm AI Research
arxiv: https://arxiv.org/abs/2104.02745

Rethinking BiSeNet For Real-time Semantic Segmentation

intro: Meituan
intro: CVPR 2021
arxiv: https://arxiv.org/abs/2104.13188
github: https://github.com/MichaelFan01/STDC-Seg

Segmenter: Transformer for Semantic Segmentation

intro: Inria
arxiv: https://arxiv.org/abs/2105.05633
github: https://github.com/rstrudel/segmenter

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

https://arxiv.org/abs/2105.15203

Per-Pixel Classification is Not All You Need for Semantic Segmentation

intro: UIUC & FAIR
project page: https://bowenc0221.github.io/maskformer/
arxiv: https://arxiv.org/abs/2107.06278

A Unified Efficient Pyramid Transformer for Semantic Segmentation

intro: School of Data Science, Fudan University & Amazon Web Services & University of California, Davis
arxiv: https://arxiv.org/abs/2107.14209

Deep Metric Learning for Open World Semantic Segmentation

intro: ICCV 2021
arxiv: https://arxiv.org/abs/2108.04562

Multi-Anchor Active Domain Adaptation for Semantic Segmentation

intro: ICCV 2021 Oral
arxiv: https://arxiv.org/abs/2108.08012

Generalize then Adapt: Source-Free Domain Adaptive Semantic Segmentation

intro: ICCV 2021
intro: Indian Institute of Science & Google Research
project page: https://sites.google.com/view/sfdaseg
arxiv: https://arxiv.org/abs/2108.11249

HRFormer: High-Resolution Transformer for Dense Prediction

intro: NeurIPS 2021
intro: University of Chinese Academy of Sciences & Institute of Computing Technology, CAS & Peking University & Microsoft Research Asia & Baidu
arxiv: https://arxiv.org/abs/2110.09408
github: https://github.com/HRNet/HRFormer

Deep Hierarchical Semantic Segmentation

intro: CVPR 2022
arxiv: https://arxiv.org/abs/2203.14335
github: https://github.com/0liliulei/HieraSeg

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation

intro: CVPR 2022
arxiv: https://arxiv.org/abs/2204.05525
github: https://github.com/hustvl/TopFormer

Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation

intro: The Hong Kong University of Science and Technology & Tsinghua University & International Digital Economy Academy (IDEA) & The Hong Kong University of Science and Technology (Guangzhou)
arxiv: https://arxiv.org/abs/2206.02777
github: https://github.com/IDEACVR/MaskDINO

Instance Segmentation

Simultaneous Detection and Segmentation

intro: ECCV 2014
author: Bharath Hariharan, Pablo Arbelaez, Ross Girshick, Jitendra Malik
arxiv: http://arxiv.org/abs/1407.1808
github(Matlab): https://github.com/bharath272/sds_eccv2014

Convolutional Feature Masking for Joint Object and Stuff Segmentation

intro: CVPR 2015
keywords: masking layers
arxiv: https://arxiv.org/abs/1412.1283
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Dai_Convolutional_Feature_Masking_2015_CVPR_paper.pdf

Proposal-free Network for Instance-level Object Segmentation

paper: http://arxiv.org/abs/1509.02636

Hypercolumns for object segmentation and fine-grained localization

intro: CVPR 2015
arxiv: https://arxiv.org/abs/1411.5752
paper: http://www.cs.berkeley.edu/~bharath2/pubs/pdfs/BharathCVPR2015.pdf

SDS using hypercolumns

github: https://github.com/bharath272/sds

Learning to decompose for object detection and instance segmentation

intro: ICLR 2016 Workshop
keywords: CNN / RNN, MNIST, KITTI
arxiv: http://arxiv.org/abs/1511.06449

Recurrent Instance Segmentation

intro: ECCV 2016
project page: http://romera-paredes.com/ris
arxiv: http://arxiv.org/abs/1511.08250
github(Torch): https://github.com/bernard24/ris
poster: http://www.eccv2016.org/files/posters/P-4B-46.pdf
youtube: https://www.youtube.com/watch?v=l_WD2OWOqBk

Instance-sensitive Fully Convolutional Networks

intro: ECCV 2016. instance segment proposal
arxiv: http://arxiv.org/abs/1603.08678

Amodal Instance Segmentation

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1604.08202

Bridging Category-level and Instance-level Semantic Image Segmentation

keywords: online bootstrapping
arxiv: http://arxiv.org/abs/1605.06885

Bottom-up Instance Segmentation using Deep Higher-Order CRFs

intro: BMVC 2016
arxiv: http://arxiv.org/abs/1609.02583

DeepCut: Object Segmentation from Bounding Box Annotations using Convolutional Neural Networks

arxiv: http://arxiv.org/abs/1605.07866

End-to-End Instance Segmentation and Counting with Recurrent Attention

intro: ReInspect
arxiv: http://arxiv.org/abs/1605.09410

Translation-aware Fully Convolutional Instance Segmentation

Fully Convolutional Instance-aware Semantic Segmentation

intro: CVPR 2017 Spotlight paper. winning entry of COCO segmentation challenge 2016
keywords: TA-FCN / FCIS
arxiv: https://arxiv.org/abs/1611.07709
github: https://github.com/msracver/FCIS
slides: https://onedrive.live.com/?cid=f371d9563727b96f&id=F371D9563727B96F%2197213&authkey=%21AEYOyOirjIutSVk

InstanceCut: from Edges to Instances with MultiCut

arxiv: https://arxiv.org/abs/1611.08272

Deep Watershed Transform for Instance Segmentation

arxiv: https://arxiv.org/abs/1611.08303

Object Detection Free Instance Segmentation With Labeling Transformations

arxiv: https://arxiv.org/abs/1611.08991

Shape-aware Instance Segmentation

arxiv: https://arxiv.org/abs/1612.03129

Interpretable Structure-Evolving LSTM

intro: CMU & Sun Yat-sen University & National University of Singapore & Adobe Research
intro: CVPR 2017 spotlight paper
arxiv: https://arxiv.org/abs/1703.03055

Mask R-CNN

intro: ICCV 2017 Best paper award. Facebook AI Research
arxiv: https://arxiv.org/abs/1703.06870
slides: http://kaiminghe.com/iccv17tutorial/maskrcnn_iccv2017_tutorial_kaiminghe.pdf
github(official, Caffe2): https://github.com/facebookresearch/Detectron
github: https://github.com/facebookresearch/maskrcnn-benchmark
github: https://github.com/TuSimple/mx-maskrcnn
slides: https://lmb.informatik.uni-freiburg.de/lectures/seminar_brox/seminar_ss17/maskrcnn_slides.pdf
github(Keras+TensorFlow): https://github.com/matterport/Mask_RCNN

Faster Training of Mask R-CNN by Focusing on Instance Boundaries

intro: BMW Car IT GmbH
arxiv: https://arxiv.org/abs/1809.07069

Boundary-preserving Mask R-CNN

intro: ECCV 2020
intro: Huazhong University of Science and Technology & Horizon Robotics Inc.
arxiv: https://arxiv.org/abs/2007.08921
github: https://github.com/hustvl/BMaskR-CNN

Semantic Instance Segmentation via Deep Metric Learning

https://arxiv.org/abs/1703.10277

Pose2Instance: Harnessing Keypoints for Person Instance Segmentation

https://arxiv.org/abs/1704.01152

Pixelwise Instance Segmentation with a Dynamically Instantiated Network

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.02386

Instance-Level Salient Object Segmentation

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03604

MEnet: A Metric Expression Network for Salient Object Segmentation

intro: IJCAI
arxiv: https://arxiv.org/abs/1805.05638

Semantic Instance Segmentation with a Discriminative Loss Function

intro: Published at “Deep Learning for Robotic Vision”, workshop at CVPR 2017
arxiv: https://arxiv.org/abs/1708.02551
github: https://github.com/Wizaron/instance-segmentation-pytorch

SceneCut: Joint Geometric and Object Segmentation for Indoor Scenes

https://arxiv.org/abs/1709.07158

S4 Net: Single Stage Salient-Instance Segmentation

arxiv: https://arxiv.org/abs/1711.07618
github: https://github.com/RuochenFan/S4Net

Deep Extreme Cut: From Extreme Points to Object Segmentation

https://arxiv.org/abs/1711.09081

Learning to Segment Every Thing

intro: CVPR 2018. UC Berkeley & Facebook AI Research
keywords: MaskX R-CNN
project page: http://ronghanghu.com/seg_every_thing/
arxiv: https://arxiv.org/abs/1711.10370
github(official, Caffe2): https://github.com/ronghanghu/seg_every_thing

Recurrent Neural Networks for Semantic Instance Segmentation

project page: https://imatge-upc.github.io/rsis/
arxiv: https://arxiv.org/abs/1712.00617
github: https://github.com/imatge-upc/rsis

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

intro: Google Inc. & RWTH Aachen University & UCLA
arxiv: https://arxiv.org/abs/1712.04837

Recurrent Pixel Embedding for Instance Grouping

intro: learning to embed pixels and group them into boundaries, object proposals, semantic segments and instances.
project page: http://www.ics.uci.edu/~skong2/SMMMSG.html
arxiv: https://arxiv.org/abs/1712.08273
github: https://github.com/aimerykong/Recurrent-Pixel-Embedding-for-Instance-Grouping
slides: http://www.ics.uci.edu/~skong2/slides/pixel_embedding_for_grouping_public_version.pdf
poster: http://www.ics.uci.edu/~skong2/slides/pixel_embedding_for_grouping_poster.pdf

Annotation-Free and One-Shot Learning for Instance Segmentation of Homogeneous Object Clusters

https://arxiv.org/abs/1802.00383

Path Aggregation Network for Instance Segmentation

intro: CVPR 2018 Spotlight
intro: CUHK & Peking University & SenseTime Research & YouTu Lab
keywords: PANet
arxiv: https://arxiv.org/abs/1803.01534
github: https://github.com/ShuLiu1993/PANet

Learning to Segment via Cut-and-Paste

intro: Google
keywords: weakly-supervised, adversarial learning setup
arxiv: https://arxiv.org/abs/1803.06414

Learning to Cluster for Proposal-Free Instance Segmentation

https://arxiv.org/abs/1803.06459

Bayesian Semantic Instance Segmentation in Open Set World

https://arxiv.org/abs/1806.00911

TernausNetV2: Fully Convolutional Network for Instance Segmentation

arxiv: https://arxiv.org/abs/1806.00844
github: https://github.com/ternaus/TernausNetV2

Dynamic Multimodal Instance Segmentation guided by natural language queries

intro: ECCV 2018
arxiv: https://arxiv.org/abs/1807.02257
github: https://github.com/andfoy/query-objseg

Traits & Transferability of Adversarial Examples against Instance Segmentation & Object Detection

https://arxiv.org/abs/1808.01452

Affinity Derivation and Graph Merge for Instance Segmentation

intro: ECCV 2018
arxiv: https://arxiv.org/abs/1811.10870

One-Shot Instance Segmentation

intro: University of Tubingen
arxiv: https://arxiv.org/abs/1811.11507

Hybrid Task Cascade for Instance Segmentation

intro: CVPR 2019
intro: The Chinese University of Hong Kong & SenseTime Research & Zhejiang University & The University of Sydney & Nanyang Technological University
intro: Winning entry of COCO 2018 Challenge (object detection task)
arxiv: https://arxiv.org/abs/1901.07518
github(mmdetection): https://github.com/open-mmlab/mmdetection/tree/master/configs/htc

Mask Scoring R-CNN

intro: CVPR 2019
intro: Huazhong University of Science and Technology & Horizon Robotics Inc.
arxiv: https://arxiv.org/abs/1903.00241
github: https://github.com/zjhuang22/maskscoring_rcnn

TensorMask: A Foundation for Dense Object Segmentation

intro: Facebook AI Research (FAIR)
arxiv: https://arxiv.org/abs/1903.12174

Actor-Critic Instance Segmentation

intro: CVPR 2019
keywords: reinforcement learning
arxiv: https://arxiv.org/abs/1904.05126

Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth

arxiv: https://arxiv.org/abs/1906.11109
github: https://github.com/davyneven/SpatialEmbeddings

InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1908.07801
github: https://github.com/GothicAi/Instaboost

SSAP: Single-Shot Instance Segmentation With Affinity Pyramid

intro: ICCV 2019
intro: Chinese Academy of Sciences & Horizon Robotics, Inc
arxiv: https://arxiv.org/abs/1909.01616

YOLACT: Real-time Instance Segmentation

intro: You Only Look At CoefficienTs
intro: University of California, Davis
keywords: one-stage, Fast NMS
arxiv: https://arxiv.org/abs/1904.02689
github(official, PyTorch): https://github.com/dbolya/yolact

YOLACT++: Better Real-time Instance Segmentation

https://arxiv.org/abs/1912.06218

YolactEdge: Real-time Instance Segmentation on the Edge

arxiv: https://arxiv.org/abs/2012.12259
github: https://github.com/haotian-liu/yolact_edge

PolarMask: Single Shot Instance Segmentation with Polar Representation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/1909.13226
github: https://github.com/xieenze/PolarMask

PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond

intro: TPAMI 2021
arxiv: https://arxiv.org/abs/2105.02184
github: https://github.com/xieenze/PolarMask

CenterMask : Real-Time Anchor-Free Instance Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/1911.06667
github: https://github.com/youngwanLEE/CenterMask
github: https://github.com/youngwanLEE/centermask2

CenterMask: single shot instance segmentation with point representation

intro: CVPR 2020
intro: Meituan Dianping Group
arxiv: https://arxiv.org/abs/2004.04446

Shape-aware Feature Extraction for Instance Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/1911.11263

PolyTransform: Deep Polygon Transformer for Instance Segmentation

https://arxiv.org/abs/1912.02801

EmbedMask: Embedding Coupling for One-stage Instance Segmentation

arxiv: https://arxiv.org/abs/1912.01954
github: https://github.com/yinghdb/EmbedMask

SAIS: Single-stage Anchor-free Instance Segmentation

https://arxiv.org/abs/1912.01176

SOLO: Segmenting Objects by Locations

arxiv: https://arxiv.org/abs/1912.04488
github: https://github.com/WXinlong/SOLO

SOLOv2: Dynamic, Faster and Stronger

arxiv: https://arxiv.org/abs/2003.10152
github: https://github.com/aim-uofa/AdelaiDet/

SOLO: A Simple Framework for Instance Segmentation

arxiv: https://arxiv.org/abs/2106.15947
github: https://github.com/aim-uofa/AdelaiDet/

RDSNet: A New Deep Architecture for Reciprocal Object Detection and Instance Segmentation

intro: AAAI 2020
intro: Chinese Academy of Sciences & Horizon Robotics Inc.
arxiv: https://arxiv.org/abs/1912.05070
github: https://github.com/wangsr126/RDSNet

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

https://arxiv.org/abs/2001.00309

Conditional Convolutions for Instance Segmentation

intro: ECCV 2020 oral
intro: The University of Adelaide
arxiv: https://arxiv.org/abs/2003.05664
github: https://github.com/aim-uofa/adet

PointINS: Point-based Instance Segmentation

intro: CUHK & MEGVII & Chinese Academy of Sciences & SmartMore
arxiv: https://arxiv.org/abs/2003.06148

1st Place Solutions for OpenImage2019 – Object Detection and Instance Segmentation

https://arxiv.org/abs/2003.07557

Mask Encoding for Single Shot Instance Segmentation

intro: CVPR 2020
intro: Tongji University & University of Adelaide & Huawei Noah’s Ark Lab
arxiv: https://arxiv.org/abs/2003.11712

The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation

arxiv: https://arxiv.org/abs/2007.11978
github: https://github.com/twangnh/SimCal

Deep Variational Instance Segmentation

https://arxiv.org/abs/2007.11576

Mask Point R-CNN

https://arxiv.org/abs/2008.00460

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation

intro: ACM MM 2020
arxiv: https://arxiv.org/abs/2008.05676
github: https://github.com/JialianW/Forest_RCNN

Seesaw Loss for Long-Tailed Instance Segmentation

https://arxiv.org/abs/2008.10032

Joint COCO and Mapillary Workshop at ICCV 2019: COCO Instance Segmentation Challenge Track

intro: 1st Place Technical Report in ICCV 2019 / ECCV 2020: MegDetV2
arxiv: https://arxiv.org/abs/2010.02475

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

intro: Zhejiang University & Alibaba Group
arxiv: https://arxiv.org/abs/2011.09876

The Devil is in the Boundary: Exploiting Boundary Representation for Basis-based Instance Segmentation

https://arxiv.org/abs/2011.13241

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion

https://arxiv.org/abs/2012.02107

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

intro: Google Research & UC Berkeley & Cornell University
arxiv: https://arxiv.org/abs/2012.07177

How Shift Equivariance Impacts Metric Learning for Instance Segmentation

https://arxiv.org/abs/2101.05846

FASA: Feature Augmentation and Sampling Adaptation for Long-Tailed Instance Segmentation

intro: Nanyang Technological University & Carnegie Mellon University
arxiv: https://arxiv.org/abs/2102.12867

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

intro: CVPR 2021
arxiv: https://arxiv.org/abs/2103.12340
github: https://github.com/lkeab/BCNet
youtube: https://www.youtube.com/watch?v=iHlGJppJGiQ
zhihu: https://zhuanlan.zhihu.com/p/378269087

Sparse Object-level Supervision for Instance Segmentation with Pixel Embeddings

arxiv: https://arxiv.org/abs/2103.14572
github: https://github.com/kreshuklab/spoco

FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter

intro: CVPR 2021
arxiv: https://arxiv.org/abs/2104.00073

ISTR: End-to-End Instance Segmentation with Transformers

arxiv: https://arxiv.org/abs/2105.00637
github: https://github.com/hujiecpp/ISTR

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

intro: CVPR 2021
intro: The Hong Kong University of Science and Technology & Kuaishou Technology
keywords: BCNet
arxiv: https://arxiv.org/abs/2103.12340
github: https://github.com/lkeab/BCNet

SOLQ: Segmenting Objects by Learning Queries

intro: MEGVII Technology
arxiv: https://arxiv.org/abs/2106.02351
github: https://github.com/megvii-research/SOLQ

1st Place Solution for YouTubeVOS Challenge 2021: Video Instance Segmentation

intro: CVPR 2021 Workshop
arxiv: https://arxiv.org/abs/2106.06649

Rank & Sort Loss for Object Detection and Instance Segmentation

intro: ICCV 2021 Oral
arxiv: https://arxiv.org/abs/2107.11669
github: https://github.com/kemaloksuz/RankSortLoss

SOTR: Segmenting Objects with Transformers

intro: ICCV 2021
arxiv: https://arxiv.org/abs/2108.06747
github: https://github.com/easton-cau/SOTR

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

intro: ICCV 2021
arxiv: https://arxiv.org/abs/2108.07058
github: https://github.com/EMI-Group/FaPN

Instances as Queries

intro: ICCV 2021
intro: HUST & Tencent
arxiv: https://arxiv.org/abs/2105.01928
github: https://github.com/hustvl/QueryInst

Mask Transfiner for High-Quality Instance Segmentation

intro: ETH Zurich & HKUST & Kuaishou Technology
arxiv: https://arxiv.org/abs/2111.13673

SOIT: Segmenting Objects with Instance-Aware Transformers

intro: AAAI 2022
arxiv: https://arxiv.org/abs/2112.11037
github: https://github.com/yuxiaodongHRI/SOIT

ContrastMask: Contrastive Learning to Segment Every Thing

intro: CVPR 2022
arxiv: https://arxiv.org/abs/2203.09775

Sparse Instance Activation for Real-Time Instance Segmentation

intro: CVPR 2022
intro: Huazhong University of Science & Technology & Horizon Robotics & CASIA
arxiv: https://arxiv.org/abs/2203.12827
github: https://github.com/hustvl/SparseInst

Human Instance Segmentation

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model

intro: Google, Inc.
keywords: Person detection and pose estimation, segmentation and grouping
arxiv: https://arxiv.org/abs/1803.08225

Pose2Seg: Detection Free Human Instance Segmentation

intro: CVPR 2019
intro: Tsinghua University & BNRist & Tencent AI Lab & Cardiff University
keywords: Occluded Human (OCHuman)
project page: http://www.liruilong.cn/Pose2Seg/index.html
arxiv: https://arxiv.org/abs/1803.10683
github: https://github.com/liruilong940607/Pose2Seg
dataset: https://cg.cs.tsinghua.edu.cn/dataset/form.html?dataset=ochuman

Bounding Box Embedding for Single Shot Person Instance Segmentation

https://arxiv.org/abs/1807.07674

Parsing R-CNN for Instance-Level Human Analysis

intro: COCO 2018 DensePose Challenge Winner
arxiv: https://arxiv.org/abs/1811.12596
github: https://github.com/soeaver/Parsing-R-CNN

Graphonomy: Universal Human Parsing via Graph Transfer Learning

intro: CVPR 2019
arxiv: https://arxiv.org/abs/1904.04536
github: https://github.com/Gaoyiminggithub/Graphonomy

Video Instance Segmentation

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

intro: ECCV 2020
arxiv: https://arxiv.org/abs/2007.14772
github: https://github.com/JialeCao001/SipMask

End-to-End Video Instance Segmentation with Transformers

intro: Meituan & The University of Adelaide
arxiv: https://arxiv.org/abs/2011.14503

Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation

intro: CVPR 2021
intro: The Hong Kong Polytechnic University & DAMO Academy, Alibaba Group
arxiv: https://arxiv.org/abs/2104.05606
github: https://github.com/MinghanLi/STMask

Tracking Instances as Queries

intro: HUST & Tencent PCG
arxiv: https://arxiv.org/abs/2106.11963

Video Mask Transfiner for High-Quality Video Instance Segmentation

intro: ECCV 2022
intro: ETH Zürich & The Hong Kong University of Science and Technology & Kuaishou Technology
arxiv: https://arxiv.org/abs/2207.14012

Panoptic Segmentation

Panoptic Segmentation

intro: Facebook AI Research (FAIR) & Heidelberg University
arxiv: https://arxiv.org/abs/1801.00868
slides: http://presentations.cocodataset.org/COCO17-Invited-PanopticAlexKirillov.pdf

Panoptic Segmentation with a Joint Semantic and Instance Segmentation Network

https://arxiv.org/abs/1809.02110

Learning to Fuse Things and Stuff

intro: Toyota Research Institute (TRI)
keywords: TASCNet
arxiv: https://arxiv.org/abs/1812.01192

Attention-guided Unified Network for Panoptic Segmentation

intro: CVPR 2019
intro: University of Chinese Academy of Sciences & Horizon Robotics, Inc. & The Johns Hopkins University
arxiv: https://arxiv.org/abs/1812.03904

Panoptic Feature Pyramid Networks

intro: FAIR
arxiv: https://arxiv.org/abs/1901.02446

UPSNet: A Unified Panoptic Segmentation Network

intro: Uber ATG & University of Toronto & The Chinese University of Hong Kong
arxiv: https://arxiv.org/abs/1901.03784

Single Network Panoptic Segmentation for Street Scene Understanding

https://arxiv.org/abs/1902.02678

An End-to-End Network for Panoptic Segmentation

https://arxiv.org/abs/1903.05027

Learning Instance Occlusion for Panoptic Segmentation

https://arxiv.org/abs/1906.05896

SpatialFlow: Bridging All Tasks for Panoptic Segmentation

https://arxiv.org/abs/1910.08787

Single-Shot Panoptic Segmentation

https://arxiv.org/abs/1911.00764

SOGNet: Scene Overlap Graph Network for Panoptic Segmentation

intro: AAAI 2020. Innovation Award in COCO 2019 challenge
arxiv: https://arxiv.org/abs/1911.07527

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

intro: UIUC & Google Research
arxiv: https://arxiv.org/abs/1911.10194

PanDA: Panoptic Data Augmentation

https://arxiv.org/abs/1911.12317

Real-Time Panoptic Segmentation from Dense Detections

intro: CVPR 2020 oral
arxiv: https://arxiv.org/abs/1912.01202
github: https://github.com/TRI-ML/realtime_panoptic

Bipartite Conditional Random Fields for Panoptic Segmentation

https://arxiv.org/abs/1912.05307

Unifying Training and Inference for Panoptic Segmentation

https://arxiv.org/abs/2001.04982

Towards Bounding-Box Free Panoptic Segmentation

intro: SLAMcore Ltd. & Imperial College London
arxiv: https://arxiv.org/abs/2002.07705

A Benchmark for LiDAR-based Panoptic Segmentation based on KITTI

project page: http://semantic-kitti.org/
arxiv: https://arxiv.org/abs/2003.02371

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

intro: Johns Hopkins University & Google Research
arxiv: https://arxiv.org/abs/2003.07853

EPSNet: Efficient Panoptic Segmentation Network with Cross-layer Attention Fusion

https://arxiv.org/abs/2003.10142

Pixel Consensus Voting for Panoptic Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2004.01849

EfficientPS: Efficient Panoptic Segmentation

arxiv: https://arxiv.org/abs/2004.02307
github: https://github.com/DeepSceneSeg/EfficientPS

Video Panoptic Segmentation

intro: CVPR 2020 Oral
intro: KAIST & Adobe Research
arxiv: https://arxiv.org/abs/2006.11339
github: https://github.com/mcahny/vps

PanoNet: Real-time Panoptic Segmentation through Position-Sensitive Feature Embedding

https://arxiv.org/abs/2008.00192

Robust Vision Challenge 2020 – 1st Place Report for Panoptic Segmentation

https://arxiv.org/abs/2008.10112

Learning Category- and Instance-Aware Pixel Embedding for Fast Panoptic Segmentation

intro: Chinese Academy of Sciences & Horizon Robotics
arxiv: https://arxiv.org/abs/2009.13342

Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation

intro: NeurIPS 2020
intro: Sun Yat-sen University & Huawei Noah’s Ark Lab & DarkMatter AI Research
arxiv: https://arxiv.org/abs/2010.16119
github: https://github.com/Jacobew/AutoPanoptic

Scaling Wide Residual Networks for Panoptic Segmentation

intro: Google Research & Johns Hopkins University
arxiv: https://arxiv.org/abs/2011.11675

Fully Convolutional Networks for Panoptic Segmentation

intro: Chinese University of Hong Kong & University of Oxford & University of Hong Kong & MEGVII Technology
arxiv: https://arxiv.org/abs/2012.00720
github: https://github.com/yanwei-li/PanopticFCN

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

intro: Johns Hopkins University & Google Research
arxiv: https://arxiv.org/abs/2012.00759

Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation

intro: AAAI 2021
intro: Sun Yat-Sen University & Huawei Noah’s Ark Lab & Shanghai Jiao Tong University
arxiv: https://arxiv.org/abs/2012.03603

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

intro: Johns Hopkins University & Google Research
arxiv: https://arxiv.org/abs/2012.05258
github: https://github.com/joe-siyuan-qiao/ViP-DeepLab

STEP: Segmenting and Tracking Every Pixel

intro: Technical University Munich & Google Research & RWTH Aachen University & MPI-IS and University of Tubingen
arxiv: https://arxiv.org/abs/2102.11859

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

intro: CVPR 2021 oral
arxiv: https://arxiv.org/abs/2103.02584

Panoptic Segmentation Forecasting

intro: CVPR 2021
arxiv: https://arxiv.org/abs/2104.03962

Exemplar-Based Open-Set Panoptic Segmentation Network

intro: CVPR 2021
intro: Seoul National University & Adobe Research
project page: https://cv.snu.ac.kr/research/EOPSN/
arxiv: https://arxiv.org/abs/2105.08336
github: https://github.com/jd730/EOPSN

Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation

intro: CVPR 2021
arxiv: https://arxiv.org/abs/2106.04555

Part-aware Panoptic Segmentation

intro: CVPR 2021
arxiv: https://arxiv.org/abs/2106.06351
github: https://github.com/tue-mps/panoptic_parts

Panoptic SegFormer

intro: Nanjing University & The University of Hong Kong & NVIDIA & Caltech
arxiv: https://arxiv.org/abs/2109.03814

Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation

intro: Samsung Research China - Beijing (SRC-B) & Samsung Advanced Institute of Technology (SAIT) & University of Oxford & The University of Hong Kong
arxiv: https://arxiv.org/abs/2112.08949

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

intro: Zhejiang University & Tencent Youtu Lab & Shanghai Jiao Tong University
arxiv: https://arxiv.org/abs/2201.04796

Panoptic, Instance and Semantic Relations: A Relational Context Encoder to Enhance Panoptic Segmentation

intro: CVPR 2022
intro: Qualcomm AI Research
arxiv: https://arxiv.org/abs/2204.05370

PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation

intro: CVPR 2022
intro: Chinese Academy of Sciences & University of Chinese Academy of Sciences & Horizon Robotics, Inc.
arxiv: https://arxiv.org/abs/2206.00468

CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation

intro: CVPR 2022 Oral
intro: Johns Hopkins University & KAIST & Google Research
arxiv: https://arxiv.org/abs/2206.08948

Uncertainty-aware Panoptic Segmentation

intro: Technical University Nurnberg
arxiv: https://arxiv.org/abs/2206.14554

k-means Mask Transformer

intro: ECCV 2022
intro: Johns Hopkins University & Google Research
arxiv: https://arxiv.org/abs/2207.04044
github: https://github.com/google-research/deeplab2

Nighttime Segmentation

Nighttime sky/cloud image segmentation

intro: ICIP 2017
arxiv: https://arxiv.org/abs/1705.10583

Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime

intro: International Conference on Intelligent Transportation Systems (ITSC 2018)
arxiv: https://arxiv.org/abs/1810.02575

Semantic Nighttime Image Segmentation with Synthetic Stylized Data, Gradual Adaptation and Uncertainty-Aware Evaluation

Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

intro: ICCV 2019
intro: ETH Zurich & KU Leuven
arxiv: https://arxiv.org/abs/1901.05946

Bi-Mix: Bidirectional Mixing for Domain Adaptive Nighttime Semantic Segmentation

arxiv: https://arxiv.org/abs/2111.10339
github: https://github.com/ygjwd12345/BiMix

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

intro: CVPR 2021 oral
intro: University of South Carolina & Farsee2 Technology Ltd
arxiv: https://arxiv.org/abs/2104.10834
github: https://github.com/W-zx-Y/DANNet

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

intro: CVPR 2022
arxiv: https://arxiv.org/abs/2204.05538
github: https://github.com/xdeng7/NightLab

Boosting Night-time Scene Parsing with Learnable Frequency

intro: Shanghai University & City University of Hong Kong & East China Normal University & Shanghai Jiao Tong University
arxiv: https://arxiv.org/abs/2208.14241

Face Parsing

Face Parsing via Recurrent Propagation

intro: BMVC 2017
arxiv: https://arxiv.org/abs/1708.01936

Face Parsing via a Fully-Convolutional Continuous CRF Neural Network

https://arxiv.org/abs/1708.03736

Face Parsing with RoI Tanh-Warping

intro: Software School of Xiamen University & Microsoft Research
arxiv: https://arxiv.org/abs/1906.01342

End-to-End Face Parsing via Interlinked Convolutional Neural Networks

https://arxiv.org/abs/2002.04831

RoI Tanh-polar Transformer Network for Face Parsing in the Wild

arxiv: https://arxiv.org/abs/2102.02717
code: https://ibug.doc.ic.ac.uk/resources/ibugmask/

Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing

intro: CVPR 2022
arxiv: https://arxiv.org/abs/2203.14448
github: https://github.com/deepinsight/insightface/tree/master/parsing/dml_csr

Specific Segmentation

A CNN Cascade for Landmark Guided Semantic Part Segmentation

project page: http://aaronsplace.co.uk/
paper: https://aaronsplace.co.uk/papers/jackson2016guided/jackson2016guided.pdf

End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks

arxiv: https://arxiv.org/abs/1703.03305

Boundary-sensitive Network for Portrait Segmentation

https://arxiv.org/abs/1712.08675

Boundary-Aware Network for Fast and High-Accuracy Portrait Segmentation

intro: Zhejiang University
arxiv: https://arxiv.org/abs/1901.03814

Beef Cattle Instance Segmentation Using Fully Convolutional Neural Network

intro: BMVC 2018
arxiv: https://arxiv.org/abs/1807.01972

Face Mask Extraction in Video Sequence

keywords: ConvLSTM & FCN
arxiv: https://arxiv.org/abs/1807.09207

Segment Proposal

Learning to Segment Object Candidates

intro: Facebook AI Research (FAIR)
intro: DeepMask. learning segmentation proposals
arxiv: http://arxiv.org/abs/1506.06204
github: https://github.com/facebookresearch/deepmask
github: https://github.com/abbypa/NNProject_DeepMask

Learning to Refine Object Segments

intro: ECCV 2016. Facebook AI Research (FAIR)
intro: SharpMask. an extension of DeepMask which generates higher-fidelity masks using an additional top-down refinement step.
arxiv: http://arxiv.org/abs/1603.08695
github: https://github.com/facebookresearch/deepmask

FastMask: Segment Object Multi-scale Candidates in One Shot

intro: CVPR 2017. University of California & Fudan University & Megvii Inc.
arxiv: https://arxiv.org/abs/1612.08843
github: https://github.com/voidrank/FastMask

Scene Labeling / Scene Parsing

Indoor Semantic Segmentation using depth information

arxiv: http://arxiv.org/abs/1301.3572

Recurrent Convolutional Neural Networks for Scene Parsing

arxiv: http://arxiv.org/abs/1306.2795
slides: http://people.ee.duke.edu/~lcarin/Yizhe8.14.2015.pdf
github: https://github.com/NP-coder/CLPS1520Project
github: https://github.com/rkargon/Scene-Labeling

Learning hierarchical features for scene labeling

paper: http://yann.lecun.com/exdb/publis/pdf/farabet-pami-13.pdf

Multi-modal unsupervised feature learning for rgb-d scene labeling

intro: ECCV 2014
paper: http://www3.ntu.edu.sg/home/wanggang/WangECCV2014.pdf

Scene Labeling with LSTM Recurrent Neural Networks

paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Byeon_Scene_Labeling_With_2015_CVPR_paper.pdf

Attend, Infer, Repeat: Fast Scene Understanding with Generative Models

“Semantic Segmentation for Scene Understanding: Algorithms and Implementations” tutorial

intro: 2016 Embedded Vision Summit
youtube: https://www.youtube.com/watch?v=pQ318oCGJGY

Semantic Understanding of Scenes through the ADE20K Dataset

arxiv: https://arxiv.org/abs/1608.05442

Learning Deep Representations for Scene Labeling with Guided Supervision

Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision

intro: CUHK
arxiv: https://arxiv.org/abs/1706.02493

Spatial As Deep: Spatial CNN for Traffic Scene Understanding

intro: AAAI 2018
arxiv: https://arxiv.org/abs/1712.06080

Multi-Path Feedback Recurrent Neural Network for Scene Parsing

arxiv: http://arxiv.org/abs/1608.07706

Scene Labeling using Recurrent Neural Networks with Explicit Long Range Contextual Dependency

arxiv: https://arxiv.org/abs/1611.07485

FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation

intro: CVPR 2022
arxiv: https://arxiv.org/abs/2204.01587

PSPNet

Pyramid Scene Parsing Network

intro: CVPR 2017
intro: mIoU of 85.4% on PASCAL VOC 2012 and 80.2% on Cityscapes; ranked 1st in the ImageNet Scene Parsing Challenge 2016
project page: http://appsrv.cse.cuhk.edu.hk/~hszhao/projects/pspnet/index.html
arxiv: https://arxiv.org/abs/1612.01105
slides: http://image-net.org/challenges/talks/2016/SenseCUSceneParsing.pdf
github: https://github.com/hszhao/PSPNet
github: https://github.com/Vladkryvoruchko/PSPNet-Keras-tensorflow

Open Vocabulary Scene Parsing

https://arxiv.org/abs/1703.08769

Deep Contextual Recurrent Residual Networks for Scene Labeling

https://arxiv.org/abs/1704.03594

Fast Scene Understanding for Autonomous Driving

intro: Published at “Deep Learning for Vehicle Perception”, workshop at the IEEE Symposium on Intelligent Vehicles 2017
arxiv: https://arxiv.org/abs/1708.02550

FoveaNet: Perspective-aware Urban Scene Parsing

https://arxiv.org/abs/1708.02421

BlitzNet: A Real-Time Deep Network for Scene Understanding

intro: INRIA
arxiv: https://arxiv.org/abs/1708.02813

Semantic Foggy Scene Understanding with Synthetic Data

https://arxiv.org/abs/1708.07819

Scale-adaptive Convolutions for Scene Parsing

intro: ICCV 2017
paper: http://openaccess.thecvf.com/content_ICCV_2017/papers/Zhang_Scale-Adaptive_Convolutions_for_ICCV_2017_paper.pdf

Restricted Deformable Convolution based Road Scene Semantic Segmentation Using Surround View Cameras

https://arxiv.org/abs/1801.00708

Dense Recurrent Neural Networks for Scene Labeling

https://arxiv.org/abs/1801.06831

DenseASPP for Semantic Segmentation in Street Scenes

intro: CVPR 2018
paper: http://openaccess.thecvf.com/content_cvpr_2018/papers/Yang_DenseASPP_for_Semantic_CVPR_2018_paper.pdf
github: https://github.com/DeepMotionAIResearch/DenseASPP

OCNet: Object Context Network for Scene Parsing

intro: Microsoft Research
arxiv: https://arxiv.org/abs/1809.00916
github: https://github.com/PkuRainBow/OCNet

PSANet: Point-wise Spatial Attention Network for Scene Parsing

intro: ECCV 2018
project page: https://hszhao.github.io/projects/psanet/
paper: https://hszhao.github.io/papers/eccv18_psanet.pdf
slides: https://docs.google.com/presentation/d/1_brKNBtv8nVu_jOwFRGwVkEPAq8B8hEngBSQuZCWaZA/edit#slide=id.p
github: https://github.com/hszhao/PSANet

Adaptive Context Network for Scene Parsing

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1911.01664

Semantic Flow for Fast and Accurate Scene Parsing

intro: ECCV 2020 oral
arxiv: https://arxiv.org/abs/2002.10120
github: https://github.com/donnyyou/torchcv

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2003.13328
github: https://github.com/Andrew-Qibin/SPNet

S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation

intro: WACV 2021
arxiv: https://arxiv.org/abs/2011.02265

Benchmarks

MIT Scene Parsing Benchmark

homepage: http://sceneparsing.csail.mit.edu/
github(devkit): https://github.com/CSAILVision/sceneparsing

Semantic Understanding of Urban Street Scenes: Benchmark Suite

https://www.cityscapes-dataset.com/benchmarks/

Challenges

Large-scale Scene Understanding Challenge

homepage: http://lsun.cs.princeton.edu/

Places2 Challenge

http://places2.csail.mit.edu/challenge.html

Human Parsing

Human Parsing with Contextualized Convolutional Neural Network

intro: ICCV 2015
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/html/Liang_Human_Parsing_With_ICCV_2015_paper.html

Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

intro: CVPR 2017. SYSU & CMU
keywords: Look Into Person (LIP)
project page: http://hcp.sysu.edu.cn/lip/
arxiv: https://arxiv.org/abs/1703.05446
github: https://github.com/Engineering-Course/LIP_SSL

Multiple-Human Parsing in the Wild

https://arxiv.org/abs/1705.07206

Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark

intro: T-PAMI 2018
keywords: Joint Body Parsing & Pose Estimation Network (JPPNet)
arxiv: https://arxiv.org/abs/1804.01984
github: https://github.com/Engineering-Course/LIP_JPPNet

Cross-domain Human Parsing via Adversarial Feature and Label Adaptation

intro: AAAI 2018
arxiv: https://arxiv.org/abs/1801.01260

Fusing Hierarchical Convolutional Features for Human Body Segmentation and Clothing Fashion Classification

intro: Wuhan University
arxiv: https://arxiv.org/abs/1803.03415

Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing

arxiv: https://arxiv.org/abs/1804.03287
github: https://github.com/ZhaoJ9014/Multi-Human-Parsing

Macro-Micro Adversarial Network for Human Parsing

intro: ECCV 2018
keywords: Macro-Micro Adversarial Net (MMAN)
arxiv: https://arxiv.org/abs/1807.08260
github: https://github.com/RoyalVane/MMAN

Instance-level Human Parsing via Part Grouping Network

intro: ECCV 2018 Oral
arxiv: https://arxiv.org/abs/1808.00157

Adaptive Temporal Encoding Network for Video Instance-level Human Parsing

intro: ACM MM 2018
arxiv: https://arxiv.org/abs/1808.00661
github(official, TensorFlow): https://github.com/HCPLab-SYSU/ATEN

Devil in the Details: Towards Accurate Single and Multiple Human Parsing

keywords: Context Embedding with Edge Perceiving (CE2P)
arxiv: https://arxiv.org/abs/1809.05996
github: https://github.com/liutinglt/CE2P

Cross-Domain Complementary Learning with Synthetic Data for Multi-Person Part Segmentation

intro: University of Washington & Microsoft
arxiv: https://arxiv.org/abs/1907.05193

Self-Correction for Human Parsing

arxiv: https://arxiv.org/abs/1910.09777
github: https://github.com/PeikeLi/Self-Correction-Human-Parsing

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing

intro: AAAI 2020
arxiv: https://arxiv.org/abs/1911.12053
github: https://github.com/Charleshhy/Grapy-ML

Learning Semantic Neural Tree for Human Parsing

intro: Institute of Software Chinese Academy of Sciences & State University of New York & JD Finance America Corporation & Tencent Youtu Lab
arxiv: https://arxiv.org/abs/1912.09622
code: https://isrc.iscas.ac.cn/gitlab/research/sematree

Self-Learning with Rectification Strategy for Human Parsing

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2004.08055

Correlating Edge, Pose with Parsing

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2005.01431
github: https://github.com/ziwei-zh/CorrPM

Affinity-aware Compression and Expansion Network for Human Parsing

https://arxiv.org/abs/2008.10191

Renovating Parsing R-CNN for Accurate Multiple Human Parsing

intro: ECCV 2020
intro: BUPT & Noah’s Ark Lab, Huawei Technologies
arxiv: https://arxiv.org/abs/2009.09447
github: https://github.com/soeaver/RP-R-CNN

Progressive One-shot Human Parsing

intro: AAAI 2021
arxiv: https://arxiv.org/abs/2012.11810
github: https://github.com/Charleshhy/One-shot-Human-Parsing

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

intro: CVPR 2021 oral
arxiv: https://arxiv.org/abs/2103.04570
github: https://github.com/tfzhou/MG-HumanParsing

Quality-Aware Network for Human Parsing

intro: BUPT & Institute of Automation Chinese Academy of Sciences & Noah’s Ark Lab
arxiv: https://arxiv.org/abs/2103.05997
github(Pytorch): https://github.com/soeaver/QANet

End-to-end One-shot Human Parsing

https://arxiv.org/abs/2105.01241

CDGNet: Class Distribution Guided Network for Human Parsing

intro: Ajou University & Tiangong University & Incheon National University
arxiv: https://arxiv.org/abs/2111.14173

AIParsing: Anchor-free Instance-level Human Parsing

intro: IEEE Transactions on Image Processing (TIP)
arxiv: https://arxiv.org/abs/2207.06854

RepParser: End-to-End Multiple Human Parsing with Representative Parts

intro: Center for Future Media & University of Electronic Science and Technology of China
arxiv: https://arxiv.org/abs/2208.12908

Joint Detection and Segmentation

Triply Supervised Decoder Networks for Joint Detection and Segmentation

https://arxiv.org/abs/1809.09299

D2Det: Towards High Quality Object Detection and Instance Segmentation

intro: CVPR 2020
paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Cao_D2Det_Towards_High_Quality_Object_Detection_and_Instance_Segmentation_CVPR_2020_paper.pdf
github: https://github.com/JialeCao001/D2Det

Video Object Segmentation

Fast object segmentation in unconstrained video

Recurrent Fully Convolutional Networks for Video Segmentation

arxiv: https://arxiv.org/abs/1606.00487

Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation

arxiv: http://arxiv.org/abs/1608.03066

Clockwork Convnets for Video Semantic Segmentation

intro: ECCV 2016 Workshops
intro: evaluated on the Youtube-Objects, NYUD, and Cityscapes video datasets
arxiv: http://arxiv.org/abs/1608.03609
github: https://github.com/shelhamer/clockwork-fcn

STFCN: Spatio-Temporal FCN for Semantic Video Segmentation

arxiv: http://arxiv.org/abs/1608.05971

One-Shot Video Object Segmentation

intro: OSVOS
project: http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos/
arxiv: https://arxiv.org/abs/1611.05198
github(official): https://github.com/kmaninis/OSVOS-caffe
github(official): https://github.com/scaelles/OSVOS-TensorFlow
github(official): https://github.com/kmaninis/OSVOS-PyTorch

DAVIS: Densely Annotated VIdeo Segmentation

homepage: http://davischallenge.org/
arxiv: https://arxiv.org/abs/1704.00675

Video Object Segmentation Without Temporal Information

https://arxiv.org/abs/1709.06031

Convolutional Gated Recurrent Networks for Video Segmentation

arxiv: https://arxiv.org/abs/1611.05435

Learning Video Object Segmentation from Static Images

arxiv: https://arxiv.org/abs/1612.02646

Semantic Video Segmentation by Gated Recurrent Flow Propagation

arxiv: https://arxiv.org/abs/1612.08871

FusionSeg: Learning to combine motion and appearance for fully automatic segmention of generic objects in videos

project page: http://vision.cs.utexas.edu/projects/fusionseg/
arxiv: https://arxiv.org/abs/1701.05384
github: https://github.com/suyogduttjain/fusionseg

Unsupervised learning from video to detect foreground objects in single images

https://arxiv.org/abs/1703.10901

Semantically-Guided Video Object Segmentation

https://arxiv.org/abs/1704.01926

Learning Video Object Segmentation with Visual Memory

https://arxiv.org/abs/1704.05737

Flow-free Video Object Segmentation

https://arxiv.org/abs/1706.09544

Online Adaptation of Convolutional Neural Networks for Video Object Segmentation

https://arxiv.org/abs/1706.09364

Video Object Segmentation using Tracked Object Proposals

intro: CVPR-2017 workshop, DAVIS-2017 Challenge
arxiv: https://arxiv.org/abs/1707.06545

Video Object Segmentation with Re-identification

intro: CVPR 2017 Workshop, DAVIS Challenge on Video Object Segmentation 2017 (Winning Entry)
arxiv: https://arxiv.org/abs/1708.00197
github(official, PyTorch): https://github.com/lxx1991/VS-ReID

Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.05137

MaskRNN: Instance Level Video Object Segmentation

intro: NIPS 2017
arxiv: https://arxiv.org/abs/1803.11187

SegFlow: Joint Learning for Video Object Segmentation and Optical Flow

project page: https://sites.google.com/site/yihsuantsai/research/iccv17-segflow
arxiv: https://arxiv.org/abs/1709.06750
github: https://github.com/JingchunCheng/SegFlow

Video Semantic Object Segmentation by Self-Adaptation of DCNN

https://arxiv.org/abs/1711.08180

Learning to Segment Moving Objects

https://arxiv.org/abs/1712.01127

Instance Embedding Transfer to Unsupervised Video Object Segmentation

intro: University of Southern California & Google Inc
arxiv: https://arxiv.org/abs/1801.00908
blog: https://medium.com/@barvinograd1/instance-embedding-instance-segmentation-without-proposals-31946a7c53e1

Efficient Video Object Segmentation via Network Modulation

intro: Snap Inc. & Northwestern University & Google Inc.
arxiv: https://arxiv.org/abs/1802.01218

Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation

intro: ECCV 2018
intro: CUHK
keywords: DyeNet
arxiv: https://arxiv.org/abs/1803.04242

Video Object Segmentation with Language Referring Expressions

https://arxiv.org/abs/1803.08006

Dynamic Video Segmentation Network

intro: CVPR 2018
keywords: DVSNet
arxiv: https://arxiv.org/abs/1804.00931
github: https://github.com/XUSean0118/DVSNet

Low-Latency Video Semantic Segmentation

intro: CVPR 2018 Spotlight
arxiv: https://arxiv.org/abs/1804.00389

Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1804.03131

Unsupervised Video Object Segmentation for Deep Reinforcement Learning

intro: University of Waterloo
arxiv: https://arxiv.org/abs/1805.07780

Fast and Accurate Online Video Object Segmentation via Tracking Parts

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1806.02323
github: https://github.com/JingchunCheng/FAVOS

ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation

intro: CVPR Workshop - DAVIS Challenge 2018
arxiv: https://arxiv.org/abs/1806.05510

Deep Spatio-Temporal Random Fields for Efficient Video Segmentation

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1807.03148

Fast Video Object Segmentation by Reference-Guided Mask Propagation

intro: CVPR 2018
paper: http://openaccess.thecvf.com/content_cvpr_2018/CameraReady/1029.pdf
github: https://github.com/seoungwugoh/RGMP

PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation

https://arxiv.org/abs/1807.09190

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

intro: ECCV 2018. Adobe Research & Snapchat Research & UIUC
project page: https://youtube-vos.org/
arxiv: https://arxiv.org/abs/1809.00461

VideoMatch: Matching based Video Object Segmentation

intro: ECCV 2018
arxiv: https://arxiv.org/abs/1809.01123

Mask Propagation Network for Video Object Segmentation

intro: ByteDance AI Lab
arxiv: https://arxiv.org/abs/1810.10289

Tukey-Inspired Video Object Segmentation

https://arxiv.org/abs/1811.07958

A Generative Appearance Model for End-to-end Video Object Segmentation

https://arxiv.org/abs/1811.11611

Unseen Object Segmentation in Videos via Transferable Representations

intro: ACCV 2018 oral
arxiv: https://arxiv.org/abs/1901.02444
github: https://github.com/wenz116/TransferSeg

FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation

intro: CVPR 2019
intro: RWTH Aachen University & Google Inc.
arxiv: https://arxiv.org/abs/1902.09513

RVOS: End-to-End Recurrent Network for Video Object Segmentation

intro: CVPR 2019
project page: https://imatge-upc.github.io/rvos/
arxiv: https://arxiv.org/abs/1903.05612

BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames

intro: CVPR 2019
intro: University of Michigan
arxiv: https://arxiv.org/abs/1903.11779
github: https://github.com/griffbr/BubbleNets
video: https://www.youtube.com/watch?v=0kNmm8SBnnU&feature=youtu.be

Fast video object segmentation with Spatio-Temporal GANs

https://arxiv.org/abs/1903.12161

Video Object Segmentation using Space-Time Memory Networks

intro: ICCV 2019
intro: Yonsei University & Adobe Research
arxiv: https://arxiv.org/abs/1904.00607
github: https://github.com/seoungwugoh/STM

Spatiotemporal CNN for Video Object Segmentation

https://arxiv.org/abs/1904.02363

Architecture Search of Dynamic Cells for Semantic Video Segmentation

https://arxiv.org/abs/1904.02371

BoLTVOS: Box-Level Tracking for Video Object Segmentation

https://arxiv.org/abs/1904.04552

MAIN: Multi-Attention Instance Network for Video Segmentation

https://arxiv.org/abs/1904.05847

MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation

intro: CVPR 2019 oral
arxiv: https://arxiv.org/abs/1904.08141

Video Instance Segmentation

intro: ICCV 2019
intro: ByteDance AI Lab & UIUC & Adobe Research
keywords: MaskTrack R-CNN
arxiv: https://arxiv.org/abs/1905.04804
github: https://github.com/youtubevos/MaskTrackRCNN

OVSNet : Towards One-Pass Real-Time Video Object Segmentation

intro: Zhejiang University & SenseTime Research & Tianjin University
arxiv: https://arxiv.org/abs/1905.10064

Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

intro: Huazhong University of Science and Technology & Horizon Robotics
arxiv: https://arxiv.org/abs/1907.01203
github: https://github.com/sydney0zq/PTSNet

RANet: Ranking Attention Network for Fast Video Object Segmentation

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1908.06647
github: https://github.com/Storife/RANet

DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1909.12471

CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1910.00132

Towards Good Practices for Video Object Segmentation

intro: ByteDance AI Lab
arxiv: https://arxiv.org/abs/1909.13583

Anchor Diffusion for Unsupervised Video Object Segmentation

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1910.10895

Learning a Spatio-Temporal Embedding for Video Instance Segmentation

intro: University of Cambridge
arxiv: https://arxiv.org/abs/1912.08969

Efficient Semantic Video Segmentation with Per-frame Inference

intro: ECCV 2020
intro: The University of Adelaide & Huazhong University of Science and Technology & Microsoft Research
arxiv: https://arxiv.org/abs/2002.11433
github: https://github.com/irfanICMLL/ETC-Real-time-Per-frame-Semantic-video-segmentation

State-Aware Tracker for Real-Time Video Object Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2003.00482
github: https://github.com/MegviiDetection/video_analyst

Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement

intro: NeurIPS 2020
arxiv: https://arxiv.org/abs/2010.07958

SwiftNet: Real-time Video Object Segmentation

https://arxiv.org/abs/2102.04604

SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation

https://arxiv.org/abs/2103.10284

Challenge

DAVIS Challenge on Video Object Segmentation 2017

http://davischallenge.org/challenge2017/publications.html

Matting

Deep Image Matting

intro: CVPR 2017
intro: Beckman Institute for Advanced Science and Technology & Adobe Research
project page: https://sites.google.com/view/deepimagematting
arxiv: https://arxiv.org/abs/1703.03872
github(unofficial): https://github.com/open-mmlab/mmediting/tree/master/configs/mattors/dim
github(unofficial): https://github.com/foamliu/Deep-Image-Matting
github(unofficial): https://github.com/foamliu/Deep-Image-Matting-PyTorch
github(unofficial): https://github.com/huochaitiantang/pytorch-deep-image-matting

Fast Deep Matting for Portrait Animation on Mobile Phone

intro: ACM Multimedia Conference (MM) 2017
intro: requires no user interaction and runs in real time at 15 fps
arxiv: https://arxiv.org/abs/1707.08289

Real-time deep hair matting on mobile devices

intro: ModiFace Inc, University of Toronto
arxiv: https://arxiv.org/abs/1712.07168

TOM-Net: Learning Transparent Object Matting from a Single Image

intro: CVPR 2018
project page: http://gychen.org/TOM-Net/
arxiv: https://arxiv.org/abs/1803.04636
github: https://github.com/guanyingc/TOM-Net

Deep Video Portraits

intro: SIGGRAPH 2018
arxiv: https://arxiv.org/abs/1805.11714
youtube: https://www.youtube.com/watch?v=qc5P2bvfl44

Inductive Guided Filter: Real-time Deep Image Matting with Weakly Annotated Masks on Mobile Devices

intro: Shanghai Jiao Tong University & Versa
arxiv: https://arxiv.org/abs/1905.06747

Indices Matter: Learning to Index for Deep Image Matting

intro: ICCV 2019
arxiv: https://arxiv.org/abs/1908.00672
github(official): https://github.com/poppinace/indexnet_matting
github: https://github.com/open-mmlab/mmediting/tree/master/configs/mattors/indexnet

Disentangled Image Matting

https://arxiv.org/abs/1909.04686

Natural Image Matting via Guided Contextual Attention

intro: AAAI 2020
arxiv: https://arxiv.org/abs/2001.04069
github: https://github.com/Yaoyi-Li/GCA-Matting

F, B, Alpha Matting

intro: ECCV 2020
arxiv: https://arxiv.org/abs/2003.07711
github: https://github.com/MarcoForte/FBA_Matting

Background Matting: The World is Your Green Screen

intro: CVPR 2020
intro: University of Washington
project page: https://grail.cs.washington.edu/projects/background-matting/
arxiv: https://arxiv.org/abs/2004.00626
github: https://github.com/senguptaumd/Background-Matting
blog: https://towardsdatascience.com/background-matting-the-world-is-your-green-screen-83a3c4f0f635

Hierarchical Opacity Propagation for Image Matting

intro: Shanghai Jiao Tong University
arxiv: https://arxiv.org/abs/2004.03249
github: https://github.com/Yaoyi-Li/HOP-Matting

High-Resolution Deep Image Matting

intro: UIUC & Adobe Research & University of Oregon
arxiv: https://arxiv.org/abs/2009.06613

Learning Affinity-Aware Upsampling for Deep Image Matting

intro: The University of Adelaide & Huazhong University of Science and Technology
arxiv: https://arxiv.org/abs/2011.14288

Real-Time High-Resolution Background Matting

project page: https://grail.cs.washington.edu/projects/background-matting-v2/
arxiv: https://arxiv.org/abs/2012.07810
github: https://github.com/PeterL1n/BackgroundMattingV2

Deep Video Matting via Spatio-Temporal Alignment and Aggregation

Trimap-guided Feature Mining and Fusion Network for Natural Image Matting

intro: Shanghai Jiao Tong University & ByteDance Inc.
arxiv: https://arxiv.org/abs/2112.00510

Boosting Robustness of Image Matting with Context Assembling and Strong Data Augmentation

intro: The University of Adelaide & Adobe Inc. & Zhejiang University
arxiv: https://arxiv.org/abs/2201.06889

MatteFormer: Transformer-Based Image Matting via Prior-Tokens

intro: Seoul National University & NAVER WEBTOON AI
arxiv: https://arxiv.org/abs/2203.15662

Referring Image Matting

intro: The University of Sydney & JD Explore Academy
arxiv: https://arxiv.org/abs/2206.05149
github: https://github.com/JizhiziLi/RIM

One-Trimap Video Matting

TransMatting: Enhancing Transparent Objects Matting with Transformers

intro: ECCV 2022
arxiv: https://arxiv.org/abs/2208.03007
github: https://github.com/AceCHQ/TransMatting

Trimap-free Matting

Semantic Human Matting

intro: ACM Multimedia 2018
arxiv: https://arxiv.org/abs/1809.01354
github(unofficial): https://github.com/lizhengwei1992/Semantic_Human_Matting

Instance Segmentation based Semantic Matting for Compositing Applications

intro: CRV 2019
arxiv: https://arxiv.org/abs/1904.05457

A Late Fusion CNN for Digital Matting

intro: CVPR 2019
intro: Zhejiang University & Alibaba Group & University of Texas at Austin
paper: https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_A_Late_Fusion_CNN_for_Digital_Matting_CVPR_2019_paper.pdf
github(official, Keras): https://github.com/yunkezhang/FusionMatting

Attention-Guided Hierarchical Structure Aggregation for Image Matting

intro: CVPR 2020
project page: https://wukaoliu.github.io/HAttMatting/
paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Qiao_Attention-Guided_Hierarchical_Structure_Aggregation_for_Image_Matting_CVPR_2020_paper.pdf
github: https://github.com/wukaoliu/CVPR2020-HAttMatting

Boosting Semantic Human Matting with Coarse Annotations

intro: Alibaba Group & Tsinghua University
arxiv: https://arxiv.org/abs/2004.04955

End-to-end Animal Image Matting

keywords: Glance and Focus Matting network (GFM), AM-2k dataset, BG-20k dataset
arxiv: https://arxiv.org/abs/2010.16188
github: https://github.com/JizhiziLi/animal-matting/

Is a Green Screen Really Necessary for Real-Time Human Matting?

intro: City University of Hong Kong & SenseTime Research
arxiv: https://arxiv.org/abs/2011.11961
github: https://github.com/ZHKKKe/MODNet

Multi-scale Information Assembly for Image Matting

arxiv: https://arxiv.org/abs/2101.02391

Salient Image Matting

intro: Fynd & University of Michigan
arxiv: https://arxiv.org/abs/2103.12337

Mask Guided Matting via Progressive Refinement Network

intro: CVPR 2021
intro: The Johns Hopkins University & Adobe
arxiv: https://arxiv.org/abs/2012.06722
github: https://github.com/yucornetto/MGMatting

Privacy-Preserving Portrait Matting

intro: The University of Sydney & JD Explore Academy
arxiv: https://arxiv.org/abs/2104.14222
github: https://github.com/JizhiziLi/P3M

Highly Efficient Natural Image Matting

intro: BMVC 2021
arxiv: https://arxiv.org/abs/2110.12748

PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset

intro: WACV 2022 workshop
intro: Baidu, Inc.
arxiv: https://arxiv.org/abs/2112.07146
github: https://github.com/PaddlePaddle/PaddleSeg

Situational Perception Guided Image Matting

intro: OPPO Research Institute & PicUp.AI & Xmotors
arxiv: https://arxiv.org/abs/2204.09276

PP-Matting: High-Accuracy Natural Image Matting

intro: Baidu Inc.
arxiv: https://arxiv.org/abs/2204.09433
github: https://github.com/PaddlePaddle/PaddleSeg

VMFormer: End-to-End Video Matting with Transformer

project page: https://chrisjuniorli.github.io/project/VMFormer/
intro: University of Oregon & UIUC & BJTU & Picsart AI Research (PAIR)
arxiv: https://arxiv.org/abs/2208.12801
github: https://github.com/SHI-Labs/VMFormer

3D Segmentation

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

intro: Stanford University
project page: http://stanford.edu/~rqi/pointnet/
arxiv: https://arxiv.org/abs/1612.00593
github: https://github.com/charlesq34/pointnet

DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks

arxiv: https://arxiv.org/abs/1703.03098

SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud

intro: UC Berkeley
arxiv: https://arxiv.org/abs/1710.07368

SEGCloud: Semantic Segmentation of 3D Point Clouds

intro: International Conference on 3D Vision (3DV) 2017 (Spotlight). Stanford University
homepage: http://segcloud.stanford.edu/
arxiv: https://arxiv.org/abs/1710.07563

3D Instance Segmentation via Multi-task Metric Learning

intro: KAUST & ETH Zurich
arxiv: https://arxiv.org/abs/1906.08650

3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation

intro: RWTH Aachen University & Google & Technical University Munich
project page: https://www.vision.rwth-aachen.de/publication/00199/
arxiv: https://arxiv.org/abs/2003.13867

PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation

intro: CVPR 2020
arxiv: https://arxiv.org/abs/2004.01658

Line Parsing

Fully Convolutional Line Parsing

intro: ICCV 2021
intro: UESTC & UC Berkeley
arxiv: https://arxiv.org/abs/2104.11207
github(PyTorch): https://github.com/Delay-Xili/F-Clip

Projects

TF Image Segmentation: Image Segmentation framework

intro: Image Segmentation framework based on TensorFlow and the TF-Slim library
github: https://github.com/warmspringwinds/tf-image-segmentation

KittiSeg: A Kitti Road Segmentation model implemented in tensorflow.

keywords: MultiNet
intro: KittiSeg performs segmentation of roads by utilizing an FCN based model.
github: https://github.com/MarvinTeichmann/KittiSeg

Semantic Segmentation Architectures Implemented in PyTorch

intro: Segnet/FCN/U-Net/Link-Net
github: https://github.com/meetshah1995/pytorch-semseg

PyTorch for Semantic Segmentation

https://github.com/ZijunDeng/pytorch-semantic-segmentation

LightNet: Light-weight Networks for Semantic Image Segmentation

project page: https://ansleliu.github.io/LightNet.html
github: https://github.com/ansleliu/LightNet

LightNet++: Boosted Light-weighted Networks for Real-time Semantic Segmentation

project page: https://ansleliu.github.io/LightNet.html
github: https://github.com/ansleliu/LightNetPlusPlus

Leaderboard

Segmentation Results: VOC2012 BETA: Competition “comp6” (train on own data)

http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?cls=mean&challengeid=11&compid=6

Blogs

Mobile Real-time Video Segmentation

https://research.googleblog.com/2018/03/mobile-real-time-video-segmentation.html

Deep Learning for Natural Image Segmentation Priors

http://cs.brown.edu/courses/csci2951-t/finals/ghope/

Image Segmentation Using DIGITS 5

https://devblogs.nvidia.com/parallelforall/image-segmentation-using-digits-5/

Image Segmentation with Tensorflow using CNNs and Conditional Random Fields

http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/18/image-segmentation-with-tensorflow-using-cnns-and-conditional-random-fields/

Fully Convolutional Networks (FCNs) for Image Segmentation

Image segmentation with Neural Net

A 2017 Guide to Semantic Segmentation with Deep Learning

http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review

Tutorials / Talks

A Unified Architecture for Instance and Semantic Segmentation

intro: FPN
slides: http://presentations.cocodataset.org/COCO17-Stuff-FAIR.pdf

Deep learning for image segmentation

intro: PyData Warsaw - Mateusz Opala & Michał Jamroż
youtube: https://www.youtube.com/watch?v=W6r_a5crqGI