Deep Learning Applications
Applications
DeepFix: A Fully Convolutional Neural Network for predicting Human Eye Fixations
Some like it hot - visual guidance for preference prediction
- arxiv: http://arxiv.org/abs/1510.07867
- demo: http://howhot.io/
Deep Learning Algorithms with Applications to Video Analytics for A Smart City: A Survey
Deep Relative Attributes
- intro: ACCV 2016
- arxiv: http://arxiv.org/abs/1512.04103
- github: https://github.com/yassersouri/ghiaseddin
Deep-Spying: Spying using Smartwatch and Deep Learning
Camera identification with deep convolutional networks
- keywords: copyright infringement cases, ownership attribution
- arxiv: http://arxiv.org/abs/1603.01068
An Analysis of Deep Neural Network Models for Practical Applications
8 Inspirational Applications of Deep Learning
- intro: Colorization of Black and White Images, Adding Sounds To Silent Movies, Automatic Machine Translation, Object Classification in Photographs, Automatic Handwriting Generation, Character Text Generation, Image Caption Generation, Automatic Game Playing
- blog: http://machinelearningmastery.com/inspirational-applications-deep-learning/
16 Open Source Deep Learning Models Running as Microservices
- intro: Places 365 Classifier, Deep Face Recognition, Real Estate Classifier, Colorful Image Colorization, Illustration Tagger, InceptionNet, Parsey McParseface, ArtsyNetworks
- blog: http://blog.algorithmia.com/2016/07/open-source-deep-learning-algorithm-roundup/
Deep Cascaded Bi-Network for Face Hallucination
- project page: http://mmlab.ie.cuhk.edu.hk/projects/CBN.html
- arxiv: http://arxiv.org/abs/1607.05046
DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation
- project page: http://yaroslav.ganin.net/static/deepwarp/
- arxiv: http://arxiv.org/abs/1607.07215
Autoencoding Blade Runner
- blog: https://medium.com/@Terrybroad/autoencoding-blade-runner-88941213abbe#.9kckqg7cq
- github: https://github.com/terrybroad/Learned-Sim-Autoencoder-For-Video-Frames
A guy trained a machine to “watch” Blade Runner. Then things got seriously sci-fi.
http://www.vox.com/2016/6/1/11787262/blade-runner-neural-network-encoding
Deep Convolution Networks for Compression Artifacts Reduction
- intro: ICCV 2015
- project page(code): http://mmlab.ie.cuhk.edu.hk/projects/ARCNN.html
- arxiv: http://arxiv.org/abs/1608.02778
Deep GDashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks
- intro: Deep Genomic Dashboard (Deep GDashboard)
- arxiv: http://arxiv.org/abs/1608.03644
Instagram photos reveal predictive markers of depression
How an Algorithm Learned to Identify Depressed Individuals by Studying Their Instagram Photos
IM2CAD
Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks
- paper: https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/melicher
- github: https://github.com/cupslab/neural_network_cracking
Defeating Image Obfuscation with Deep Learning
Detecting Music BPM using Neural Networks
- keywords: BPM (Beats Per Minute)
- blog: https://nlml.github.io/neural-networks/detecting-bpm-neural-networks/
- github: https://github.com/nlml/bpm
Generative Visual Manipulation on the Natural Image Manifold
- intro: ECCV 2016
- project page: https://people.eecs.berkeley.edu/~junyanz/projects/gvm/
- arxiv: http://arxiv.org/abs/1609.03552
- github: https://github.com/junyanz/iGAN
Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition
Deep Gold: Using Convolution Networks to Find Minerals
- blog: https://hackernoon.com/deep-gold-using-convolution-networks-to-find-minerals-aafdb37355df#.lgh95ub4a
- github: https://github.com/scottvallance/DeepGold
Predicting First Impressions with Deep Learning
Judging a Book By its Cover
- arxiv: https://arxiv.org/abs/1610.09204
- review: https://www.technologyreview.com/s/602807/deep-neural-network-learns-to-judge-books-by-their-covers/
Image Credibility Analysis with Effective Domain Transferred Deep Networks
A novel image tag completion method based on convolutional neural network
Image operator learning coupled with CNN classification and its application to staff line removal
- intro: ICDAR 2017
- arxiv: https://arxiv.org/abs/1709.06476
Joint Image Filtering with Deep Convolutional Networks
- intro: University of California, Merced & Virginia Tech & University of Illinois
- project page: http://vllab1.ucmerced.edu/~yli62/DJF_residual/
- arxiv: https://arxiv.org/abs/1710.04200
- github: https://github.com/Yijunmaverick/DeepJointFilter
DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1704.02470
- github: https://github.com/aiff22/DPED
Neural Scene De-rendering
- intro: CVPR 2017
- project page: http://nsd.csail.mit.edu/
- paper: http://nsd.csail.mit.edu/papers/nsd_cvpr.pdf
- github: https://github.com/jiajunwu/nsd
Image2GIF: Generating Cinemagraphs using Recurrent Deep Q-Networks
- intro: WACV 2018
- project page: http://bvision11.cs.unc.edu/bigpen/yipin/WACV2018/
- arxiv: https://arxiv.org/abs/1801.09042
Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata
https://arxiv.org/abs/1801.09108
Single Image Reflection Removal Using Deep Encoder-Decoder Network
https://arxiv.org/abs/1802.00094
CRRN: Multi-Scale Guided Concurrent Reflection Removal Network
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1805.11802
Learning Deep Convolutional Networks for Demosaicing
https://arxiv.org/abs/1802.03769
Fully convolutional watermark removal attack
ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes
Learning to See in the Dark
- intro: CVPR 2018
- project page: http://web.engr.illinois.edu/~cchen156/SID.html
- arxiv: https://arxiv.org/abs/1805.01934
- github: https://github.com/cchen156/Learning-to-See-in-the-Dark
- video: https://www.youtube.com/watch?v=qWKUFK7MWvg&feature=youtu.be
- video: https://www.bilibili.com/video/av23195280/
Generative Smoke Removal
https://arxiv.org/abs/1902.00311
Mask-ShadowGAN: Learning to Remove Shadows from Unpaired Data
https://arxiv.org/abs/1903.10683
Blind Visual Motif Removal from a Single Image
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1904.02756
Neural Camera Simulators
- intro: CVPR 2021
- arxiv: https://arxiv.org/abs/2104.05237
Lighting the Darkness in the Deep Learning Era
- arxiv: https://arxiv.org/abs/2104.10729
- github: https://github.com/Li-Chongyi/Lighting-the-Darkness-in-the-Deep-Learning-Era-Open
Boundary / Edge / Contour Detection
Holistically-Nested Edge Detection
- intro: ICCV 2015, Marr Prize
- paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf
- arxiv: http://arxiv.org/abs/1504.06375
- github: https://github.com/s9xie/hed
- github: https://github.com/moabitcoin/holy-edge
Unsupervised Learning of Edges
- intro: CVPR 2016. Facebook AI Research
- arxiv: http://arxiv.org/abs/1511.04166
- zh-blog: http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html
Pushing the Boundaries of Boundary Detection using Deep Learning
Convolutional Oriented Boundaries
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1608.02755
Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks
- project page: http://www.vision.ee.ethz.ch/~cvlsegmentation/
- arxiv: https://arxiv.org/abs/1701.04658
- github: https://github.com/kmaninis/COB
Richer Convolutional Features for Edge Detection
- intro: CVPR 2017
- keywords: richer convolutional features (RCF)
- arxiv: https://arxiv.org/abs/1612.02103
- github: https://github.com/yun-liu/rcf
Contour Detection from Deep Patch-level Boundary Prediction
https://arxiv.org/abs/1705.03159
CASENet: Deep Category-Aware Semantic Edge Detection
- intro: CVPR 2017. CMU & Mitsubishi Electric Research Laboratories (MERL)
- arxiv: https://arxiv.org/abs/1705.09759
- code: http://www.merl.com/research/license#CASENet
- video: https://www.youtube.com/watch?v=BNE1hAP6Qho
Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction
- intro: NIPS 2017
- arxiv: https://arxiv.org/abs/1801.00524
Deep Crisp Boundaries: From Boundaries to Higher-level Tasks
https://arxiv.org/abs/1801.02439
DOOBNet: Deep Object Occlusion Boundary Detection from an Image
https://arxiv.org/abs/1806.03772
Dynamic Feature Fusion for Semantic Edge Detection
https://arxiv.org/abs/1902.09104
EDTER: Edge Detection with Transformer
- intro: CVPR 2022
- arxiv: https://arxiv.org/abs/2203.08566
- github: https://github.com/MengyangPu/EDTER
Image Processing
Fast Image Processing with Fully-Convolutional Networks
- intro: ICCV 2017. Qifeng Chen (陈启峰)
- project page: http://www.cqf.io/ImageProcessing/
- arxiv: https://arxiv.org/abs/1709.00643
- supp: https://youtu.be/eQyfHgLx8Dc
- github: https://github.com/CQFIO/FastImageProcessing
DeepISP: Learning End-to-End Image Processing Pipeline
https://arxiv.org/abs/1801.06724
Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing
- intro: AAAI 2019
- arxiv: https://arxiv.org/abs/1811.04323
Image-Text
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
https://arxiv.org/abs/1704.03470
Dual-Path Convolutional Image-Text Embedding
Conditional Image-Text Embedding Networks
https://arxiv.org/abs/1711.08389
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
https://arxiv.org/abs/1711.10485
Stacked Cross Attention for Image-Text Matching
https://arxiv.org/abs/1803.08024
Age Estimation
Deeply-Learned Feature for Age Estimation
Age and Gender Classification using Convolutional Neural Networks
- paper: http://www.openu.ac.il/home/hassner/projects/cnn_agegender/CNN_AgeGenderEstimation.pdf
- project page: http://www.openu.ac.il/home/hassner/projects/cnn_agegender/
- github: https://github.com/GilLevi/AgeGenderDeepLearning
Group-Aware Deep Feature Learning For Facial Age Estimation
Local Deep Neural Networks for Age and Gender Classification
https://arxiv.org/abs/1703.08497
Understanding and Comparing Deep Neural Networks for Age and Gender Classification
https://arxiv.org/abs/1708.07689
Age Group and Gender Estimation in the Wild with Deep RoR Architecture
- intro: IEEE Access
- arxiv: https://arxiv.org/abs/1710.02985
Age and gender estimation based on Convolutional Neural Network and TensorFlow
https://github.com/BoyuanJiang/Age-Gender-Estimate-TF
Deep Regression Forests for Age Estimation
- intro: Shanghai University & Johns Hopkins University & Nankai University
- arxiv: https://arxiv.org/abs/1712.07195
Face Aging
Recurrent Face Aging
Face Aging With Conditional Generative Adversarial Networks
Learning Face Age Progression: A Pyramid Architecture of GANs
https://arxiv.org/abs/1711.10352
Face Aging with Contextual Generative Adversarial Nets
- intro: ACM Multimedia 2017
- arxiv: https://arxiv.org/abs/1802.00237
Recursive Chaining of Reversible Image-to-image Translators For Face Aging
https://arxiv.org/abs/1802.05023
Emotion Recognition / Expression Recognition
Real-time emotion recognition for gaming using deep convolutional network features
Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns
- project page: http://www.openu.ac.il/home/hassner/projects/cnn_emotions/
- paper: http://www.openu.ac.il/home/hassner/projects/cnn_emotions/LeviHassnerICMI15.pdf
- github: https://gist.github.com/GilLevi/54aee1b8b0397721aa4b
- blog: https://gilscvblog.com/2017/01/31/emotion-recognition-in-the-wild-via-convolutional-neural-networks-and-mapped-binary-patterns/
DeXpression: Deep Convolutional Neural Network for Expression Recognition
DEX: Deep EXpectation of apparent age from a single image
- intro: ICCV 2015
- paper: https://www.vision.ee.ethz.ch/en/publications/papers/proceedings/eth_biwi_01229.pdf
- homepage: https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/
EmotioNet: An accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild
- intro: CVPR 2016
- paper: http://cbcsl.ece.ohio-state.edu/cvpr16.pdf
- database: http://cbcsl.ece.ohio-state.edu/dbform_emotionet.html
How Deep Neural Networks Can Improve Emotion Recognition on Video Data
- intro: ICIP 2016
- arxiv: http://arxiv.org/abs/1602.07377
Peak-Piloted Deep Network for Facial Expression Recognition
Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution
A Recursive Framework for Expression Recognition: From Web Images to Deep Models to Game Dataset
FaceNet2ExpNet: Regularizing a Deep Face Recognition Net for Expression Recognition
EmotionNet Challenge
- homepage: http://cbcsl.ece.ohio-state.edu/EmotionNetChallenge/index.html
- dataset: http://cbcsl.ece.ohio-state.edu/dbform_emotionet.html
Baseline CNN structure analysis for facial expression recognition
- intro: RO-MAN 2016 Conference
- arxiv: https://arxiv.org/abs/1611.04251
Facial Expression Recognition using Convolutional Neural Networks: State of the Art
DAGER: Deep Age, Gender and Emotion Recognition Using Convolutional Neural Network
Deep generative-contrastive networks for facial expression recognition
https://arxiv.org/abs/1703.07140
Convolutional Neural Networks for Facial Expression Recognition
https://arxiv.org/abs/1704.06756
End-to-End Multimodal Emotion Recognition using Deep Neural Networks
- intro: Imperial College London
- arxiv: https://arxiv.org/abs/1704.08619
Spatial-Temporal Recurrent Neural Network for Emotion Recognition
https://arxiv.org/abs/1705.04515
Facial Emotion Detection Using Convolutional Neural Networks and Representational Autoencoder Units
https://arxiv.org/abs/1706.01509
Temporal Multimodal Fusion for Video Emotion Classification in the Wild
https://arxiv.org/abs/1709.07200
Island Loss for Learning Discriminative Features in Facial Expression Recognition
https://arxiv.org/abs/1710.03144
Real-time Convolutional Neural Networks for Emotion and Gender Classification
https://arxiv.org/abs/1710.07557
Attribution Prediction
PANDA: Pose Aligned Networks for Deep Attribute Modeling
- intro: Facebook. CVPR 2014
- arxiv: http://arxiv.org/abs/1311.5591
- github: https://github.com/facebook/pose-aligned-deep-networks
Predicting psychological attributions from face photographs with a deep neural network
Learning Human Identity from Motion Patterns
Place Recognition
NetVLAD: CNN architecture for weakly supervised place recognition
- intro: CVPR 2016
- intro: Google Street View Time Machine, soft-assignment, Weakly supervised triplet ranking loss
- homepage: http://www.di.ens.fr/willow/research/netvlad/
- arxiv: http://arxiv.org/abs/1511.07247
PlaNet - Photo Geolocation with Convolutional Neural Networks
- arxiv: http://arxiv.org/abs/1602.05314
- review(“Google Unveils Neural Network with “Superhuman” Ability to Determine the Location of Almost Any Image”): https://www.technologyreview.com/s/600889/google-unveils-neural-network-with-superhuman-ability-to-determine-the-location-of-almost/
- github(“City-Recognition: CS231n Project for Winter 2016”): https://github.com/dmakian/LittlePlaNet
- github: https://github.com/wulfebw/LittlePlaNet-Models
Visual place recognition using landmark distribution descriptors
Low-effort place recognition with WiFi fingerprints using deep learning
- arxiv: https://arxiv.org/abs/1611.02049
- github: https://github.com/aqibsaeed/Place-Recognition-using-Autoencoders-and-NN
- github(Keras): https://github.com/mallsk23/place_recognition_wifi_fingerprints_deep_learning
Deep Learning Features at Scale for Visual Place Recognition
- intro: ICRA 2017
- arxiv: https://arxiv.org/abs/1701.05105
Place recognition: An Overview of Vision Perspective
https://arxiv.org/abs/1707.03470
Camera Relocalization
PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization
- paper: http://arxiv.org/abs/1505.07427
- project page: http://mi.eng.cam.ac.uk/projects/relocalisation/#results
- github: https://github.com/alexgkendall/caffe-posenet
- github(TensorFlow): https://github.com/kentsommer/tensorflow-posenet
Modelling Uncertainty in Deep Learning for Camera Relocalization
Random Forests versus Neural Networks - What’s Best for Camera Relocalization?
Deep Convolutional Neural Network for 6-DOF Image Localization
DSAC - Differentiable RANSAC for Camera Localization
Image-based Localization with Spatial LSTMs
VidLoc: 6-DoF Video-Clip Relocalization
Towards CNN Map Compression for camera relocalisation
Camera Relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Network
- intro: Aalto University & Indian Institute of Technology
- arxiv: https://arxiv.org/abs/1707.09733
MapNet: Geometry-Aware Learning of Maps for Camera Localization
- intro: Georgia Institute of Technology & NVIDIA
- arxiv: https://arxiv.org/abs/1712.03342
Image-to-GPS Verification Through A Bottom-Up Pattern Matching Network
https://arxiv.org/abs/1811.07288
Activity Recognition
Implementing a CNN for Human Activity Recognition in Tensorflow
- blog: http://aqibsaeed.github.io/2016-11-04-human-activity-recognition-cnn/
- github: https://github.com/aqibsaeed/Human-Activity-Recognition-using-CNN
Concurrent Activity Recognition with Multimodal CNN-LSTM Structure
CERN: Confidence-Energy Recurrent Network for Group Activity Recognition
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1704.03058
Deploying Tensorflow model on Android device for Human Activity Recognition
- blog: http://aqibsaeed.github.io/2017-05-02-deploying-tensorflow-model-andorid-device-human-activity-recognition/
- github: https://github.com/aqibsaeed/Human-Activity-Recognition-using-CNN/tree/master/ActivityRecognition
Music Classification / Sound Classification
Explaining Deep Convolutional Neural Networks on Music Classification
- arxiv: http://arxiv.org/abs/1607.02444
- blog: https://keunwoochoi.wordpress.com/2015/12/09/ismir-2015-lbd-auralisation-of-deep-convolutional-neural-networks-listening-to-learned-features-auralization/
- blog: https://keunwoochoi.wordpress.com/2016/03/23/what-cnns-see-when-cnns-see-spectrograms/
- github: https://github.com/keunwoochoi/Auralisation
- audio samples: https://soundcloud.com/kchoi-research
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
- project page: http://www.stat.ucla.edu/~yang.lu/project/deepFrame/main.html
- arxiv: http://arxiv.org/abs/1608.04363
Convolutional Recurrent Neural Networks for Music Classification
- arxiv: http://arxiv.org/abs/1609.04243
- blog: https://keunwoochoi.wordpress.com/2016/09/15/paper-is-out-convolutional-recurrent-neural-networks-for-music-classification/
- github: https://github.com/keunwoochoi/music-auto_tagging-keras
CNN Architectures for Large-Scale Audio Classification
- intro: Google
- arxiv: https://arxiv.org/abs/1609.09430
- demo: https://www.youtube.com/watch?v=oAAo_r7ZT8U&feature=youtu.be
SoundNet: Learning Sound Representations from Unlabeled Video
- intro: MIT. NIPS 2016
- project page: http://projects.csail.mit.edu/soundnet/
- arxiv: https://arxiv.org/abs/1610.09001
- paper: http://web.mit.edu/vondrick/soundnet.pdf
- github: https://github.com/cvondrick/soundnet
- github: https://github.com/eborboihuc/SoundNet-tensorflow
- youtube: https://www.youtube.com/watch?v=yJCjVvIY4dU
Deep Learning ‘ahem’ detector
- github: https://github.com/worldofpiggy/deeplearning-ahem-detector
- slides: https://docs.google.com/presentation/d/1QXQEOiAMj0uF2_Gafr2bn-kMniUJAIM1PLTFm1mUops/edit#slide=id.g35f391192_00
- mirror: https://pan.baidu.com/s/1c2KGlwO
GenreFromAudio: Finding the genre of a song with Deep Learning
- intro: A pipeline to build a dataset from your own music library and use it to fill the missing genres
- github: https://github.com/despoisj/DeepAudioClassification
TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition
- arxiv: https://arxiv.org/abs/1703.10667
- github: https://github.com/chihyaoma/Activity-Recognition-with-CNN-and-RNN
On the Robustness of Deep Convolutional Neural Networks for Music Classification
- intro: Queen Mary University of London & New York University
- arxiv: https://arxiv.org/abs/1706.02361
NSFW Detection / Classification
Nipple Detection using Convolutional Neural Network
Applying deep learning to classify pornographic images and videos
Moderate, Filter, or Curate Adult Content with Clarifai’s NSFW Model
What Convolutional Neural Networks Look at When They See Nudity
- blog: http://blog.clarifai.com/what-convolutional-neural-networks-see-at-when-they-see-nudity#.VzVh_-yECZY
Open Sourcing a Deep Learning Solution for Detecting NSFW Images
- intro: Yahoo
- blog: https://yahooeng.tumblr.com/post/151148689421/open-sourcing-a-deep-learning-solution-for
- github: https://github.com/yahoo/open_nsfw
Miles Deep - AI Porn Video Editor
- intro: Deep Learning Porn Video Classifier/Editor with Caffe
- github: https://github.com/ryanjay0/miles-deep
Image Reconstruction / Inpainting
Context Encoders: Feature Learning by Inpainting
- intro: CVPR 2016
- intro: Unsupervised Feature Learning by Image Inpainting using GANs
- project page: http://www.cs.berkeley.edu/~pathak/context_encoder/
- arxiv: https://arxiv.org/abs/1604.07379
- github(official): https://github.com/pathak22/context-encoder
- github: https://github.com/BoyuanJiang/context_encoder_pytorch
Semantic Image Inpainting with Perceptual and Contextual Losses
Semantic Image Inpainting with Deep Generative Models
- keywords: Deep Convolutional Generative Adversarial Network (DCGAN)
- arxiv: http://arxiv.org/abs/1607.07539
- github: https://github.com/bamos/dcgan-completion.tensorflow
High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis
- intro: University of Southern California & Adobe Research
- arxiv: https://arxiv.org/abs/1611.09969
Face Image Reconstruction from Deep Templates
https://www.arxiv.org/abs/1703.00832
Deep Learning-Guided Image Reconstruction from Incomplete Data
https://arxiv.org/abs/1709.00584
Image Inpainting using Multi-Scale Feature Image Translation
https://arxiv.org/abs/1711.08590
Image Inpainting for High-Resolution Textures using CNN Texture Synthesis
https://arxiv.org/abs/1712.03111
Context-Aware Semantic Inpainting
https://arxiv.org/abs/1712.07778
Deep Blind Image Inpainting
https://arxiv.org/abs/1712.09078
Deep Stacked Networks with Residual Polishing for Image Inpainting
https://arxiv.org/abs/1801.00289
Light-weight pixel context encoders for image inpainting
https://arxiv.org/abs/1801.05585
Deep Structured Energy-Based Image Inpainting
https://arxiv.org/abs/1801.07939
Shift-Net: Image Inpainting via Deep Feature Rearrangement
https://arxiv.org/abs/1801.09392
Cascade context encoder for improved inpainting
https://arxiv.org/abs/1803.04033
SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting
- intro: University of Southern California & Baidu Research
- arxiv: https://arxiv.org/abs/1805.03356
Free-Form Image Inpainting with Gated Convolution
https://arxiv.org/abs/1806.03589
Keras implementation of Image OutPainting
- intro: Stanford CS230 project
- paper: https://cs230.stanford.edu/projects_spring_2018/posters/8265861.pdf
- github: https://github.com/bendangnuksung/Image-OutPainting
Image Inpainting via Generative Multi-column Convolutional Neural Networks
- intro: NIPS 2018
- arxiv: https://arxiv.org/abs/1810.08771
Deep Inception Generative Network for Cognitive Image Inpainting
https://arxiv.org/abs/1812.01458
Foreground-aware Image Inpainting
- intro: University of Rochester & University of Illinois at Urbana-Champaign & Adobe Research
- arxiv: https://arxiv.org/abs/1901.05945
Image Restoration
Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections
- intro: NIPS 2016
- arxiv: http://arxiv.org/abs/1603.09056
Image Restoration Using Convolutional Auto-encoders with Symmetric Skip Connections
Image Completion with Deep Learning in TensorFlow
Deeply Aggregated Alternating Minimization for Image Restoration
A New Convolutional Network-in-Network Structure and Its Applications in Skin Detection, Semantic Segmentation, and Artifact Reduction
- intro: Seoul National University
- arxiv: https://arxiv.org/abs/1701.06190
MemNet: A Persistent Memory Network for Image Restoration
- intro: ICCV 2017 (Spotlight presentation)
- arxiv: https://arxiv.org/abs/1708.02209
- paper: http://cvlab.cse.msu.edu/pdfs/Image_Restoration%20using_Persistent_Memory_Network.pdf
- github: https://github.com/tyshiwo/MemNet
Deep Mean-Shift Priors for Image Restoration
- intro: NIPS 2017
- arxiv: https://arxiv.org/abs/1709.03749
xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
Deep Image Prior
- intro: Skolkovo Institute of Science and Technology & University of Oxford
- project page: https://dmitryulyanov.github.io/deep_image_prior
- arxiv: https://arxiv.org/abs/1711.10925
- paper: https://sites.skoltech.ru/app/data/uploads/sites/25/2017/11/deep_image_prior.pdf
- github: https://github.com/DmitryUlyanov/deep-image-prior
- reddit: https://www.reddit.com/r/MachineLearning/comments/7gls3j/r_deep_image_prior_deep_superresolution/
Denoising Prior Driven Deep Neural Network for Image Restoration
https://arxiv.org/abs/1801.06756
Globally and Locally Consistent Image Completion
- intro: SIGGRAPH 2017
- project page: http://hi.cs.waseda.ac.jp/~iizuka/projects/completion/en/
- paper: http://hi.cs.waseda.ac.jp/~iizuka/projects/completion/data/completion_sig2017.pdf
- github(official): https://github.com/satoshiiizuka/siggraph2017_inpainting
- github: https://github.com/akmtn/pytorch-siggraph2017-inpainting
Multi-level Wavelet-CNN for Image Restoration
- intro: CVPR 2018 NTIRE Workshop
- arxiv: https://arxiv.org/abs/1805.07071
Non-Local Recurrent Network for Image Restoration
- intro: University of Illinois at Urbana-Champaign & The Chinese University of Hong Kong
- arxiv: https://arxiv.org/abs/1806.02919
Residual Non-local Attention Networks for Image Restoration
- intro: ICLR 2019
- arxiv: https://arxiv.org/abs/1903.10082
Face Completion
Generative Face Completion
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1704.05838
High Resolution Face Completion with Multiple Controllable Attributes via Fully End-to-End Progressive Generative Adversarial Networks
- intro: North Carolina State University
- arxiv: https://arxiv.org/abs/1801.07632
Image Denoising
Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising
- arxiv: http://arxiv.org/abs/1608.03981
- github: https://github.com/cszn/DnCNN
Medical image denoising using convolutional denoising autoencoders
Rectifier Neural Network with a Dual-Pathway Architecture for Image Denoising
Non-Local Color Image Denoising with Convolutional Neural Networks
Joint Visual Denoising and Classification using Deep Learning
- intro: ICIP 2016
- arxiv: https://arxiv.org/abs/1612.01075
- github: https://github.com/ganggit/jointmodel
Deep Convolutional Denoising of Low-Light Images
Deep Class Aware Denoising
End-to-End Learning for Structured Prediction Energy Networks
- intro: University of Massachusetts & CMU
- arxiv: https://arxiv.org/abs/1703.05667
Block-Matching Convolutional Neural Network for Image Denoising
https://arxiv.org/abs/1704.00524
When Image Denoising Meets High-Level Vision Tasks: A Deep Learning Approach
https://arxiv.org/abs/1706.04284
Wide Inference Network for Image Denoising
https://arxiv.org/abs/1707.05414
Learning Pixel-Distribution Prior with Wider Convolution for Image Denoising
- arxiv: https://arxiv.org/abs/1707.09135
- github(MatConvNet): https://github.com/cswin/WIN
Image Denoising via CNNs: An Adversarial Approach
- intro: Indian Institute of Science
- arxiv: https://arxiv.org/abs/1708.00159
An ELU Network with Total Variation for Image Denoising
- intro: 24th International Conference on Neural Information Processing (2017)
- arxiv: https://arxiv.org/abs/1708.04317
Dilated Residual Network for Image Denoising
https://arxiv.org/abs/1708.05473
FFDNet: Toward a Fast and Flexible Solution for CNN based Image Denoising
- arxiv: https://arxiv.org/abs/1710.04026
- github(MatConvNet): https://github.com/cszn/FFDNet
Universal Denoising Networks: A Novel CNN-based Network Architecture for Image Denoising
https://arxiv.org/abs/1711.07807
Burst Denoising with Kernel Prediction Networks
- project page: http://people.eecs.berkeley.edu/~bmild/kpn/
- arxiv: https://arxiv.org/abs/1712.02327
Chaining Identity Mapping Modules for Image Denoising
https://arxiv.org/abs/1712.02933
Deep Burst Denoising
https://arxiv.org/abs/1712.05790
Fast, Trainable, Multiscale Denoising
- intro: Google Research
- arxiv: https://arxiv.org/abs/1802.06130
Training Deep Learning based Denoisers without Ground Truth Data
https://arxiv.org/abs/1803.01314
Identifying Recurring Patterns with Deep Neural Networks for Natural Image Denoising
https://arxiv.org/abs/1806.05229
Class-Aware Fully-Convolutional Gaussian and Poisson Denoising
https://arxiv.org/abs/1808.06562
Connecting Image Denoising and High-Level Vision Tasks via Deep Learning
- intro: IJCAI 2018
- arxiv: https://arxiv.org/abs/1809.01826
- github: https://github.com/Ding-Liu/DeepDenoising
DN-ResNet: Efficient Deep Residual Network for Image Denoising
https://arxiv.org/abs/1810.06766
Deep Learning for Image Denoising: A Survey
https://arxiv.org/abs/1810.05052
Image Dehazing / Image Haze Removal
DehazeNet: An End-to-End System for Single Image Haze Removal
An All-in-One Network for Dehazing and Beyond
- intro: All-in-One Dehazing Network (AOD-Net)
- arxiv: https://arxiv.org/abs/1707.06543
Joint Transmission Map Estimation and Dehazing using Deep Networks
https://arxiv.org/abs/1708.00581
End-to-End United Video Dehazing and Detection
https://arxiv.org/abs/1709.03919
Image Dehazing using Bilinear Composition Loss Function
https://arxiv.org/abs/1710.00279
Learning Aggregated Transmission Propagation Networks for Haze Removal and Beyond
https://arxiv.org/abs/1711.06787
CANDY: Conditional Adversarial Networks based Fully End-to-End System for Single Image Haze Removal
https://arxiv.org/abs/1801.02892
C2MSNet: A Novel approach for single image haze removal
- intro: WACV 2018
- arxiv: https://arxiv.org/abs/1801.08406
A Cascaded Convolutional Neural Network for Single Image Dehazing
- intro: IEEE Access
- arxiv: https://arxiv.org/abs/1803.07955
Densely Connected Pyramid Dehazing Network
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1803.08396
- github: https://github.com/hezhangsprinter/DCPDN
Gated Fusion Network for Single Image Dehazing
- project page: https://sites.google.com/site/renwenqi888/research/dehazing/gfn
- arxiv: https://arxiv.org/abs/1804.00213
Semantic Single-Image Dehazing
https://arxiv.org/abs/1804.05624
Perceptually Optimized Generative Adversarial Network for Single Image Dehazing
https://arxiv.org/abs/1805.01084
PAD-Net: A Perception-Aided Single Image Dehazing Network
- arxiv: https://arxiv.org/abs/1805.03146
- github: https://github.com/guanlongzhao/single-image-dehazing
The Effectiveness of Instance Normalization: a Strong Baseline for Single Image Dehazing
https://arxiv.org/abs/1805.03305
Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing
- intro: CVPRW: NTIRE 2018
- arxiv: https://arxiv.org/abs/1805.05308
Deep learning for dehazing: Comparison and analysis
- intro: CVCS 2018
- arxiv: https://arxiv.org/abs/1806.10923
Generic Model-Agnostic Convolutional Neural Network for Single Image Dehazing
https://arxiv.org/abs/1810.02862
Image Rain Removal / De-raining
Clearing the Skies: A deep network architecture for single-image rain removal
- intro: DerainNet
- project page: http://smartdsp.xmu.edu.cn/derainNet.html
- arxiv: http://arxiv.org/abs/1609.02087
- code(Matlab): http://smartdsp.xmu.edu.cn/memberpdf/fuxueyang/derainNet/code.zip
Joint Rain Detection and Removal via Iterative Region Dependent Multi-Task Learning
Image De-raining Using a Conditional Generative Adversarial Network
Single Image Deraining using Scale-Aware Multi-Stage Recurrent Network
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1712.06830
Deep joint rain and haze removal from single images
https://arxiv.org/abs/1801.06769
Density-aware Single Image De-raining using a Multi-stream Dense Network
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1802.07412
- github: https://github.com/hezhangsprinter/DID-MDN
Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
https://arxiv.org/abs/1803.10433
Fast Single Image Rain Removal via a Deep Decomposition-Composition Network
https://arxiv.org/abs/1804.02688
Residual-Guide Feature Fusion Network for Single Image Deraining
https://arxiv.org/abs/1804.07493
Lightweight Pyramid Networks for Image Deraining
https://arxiv.org/abs/1805.06173
Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1807.05698
- code: https://xialipku.github.io/RESCAN/
Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining
- intro: ACM Multimedia 2018
- arxiv: https://arxiv.org/abs/1808.01491
Gated Context Aggregation Network for Image Dehazing and Deraining
- intro: WACV 2019
- arxiv: https://arxiv.org/abs/1811.08747
A Deep Tree-Structured Fusion Model for Single Image Deraining
https://arxiv.org/abs/1811.08632
A^2Net: Adjacent Aggregation Networks for Image Raindrop Removal
https://arxiv.org/abs/1811.09780
Single Image Deraining: A Comprehensive Benchmark Analysis
- arxiv: https://arxiv.org/abs/1903.08558
- github: https://github.com/lsy17096535/Single-Image-Deraining
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset
- intro: CVPR 2019
- project page: https://stevewongv.github.io/derain-project.html
- arxiv: https://arxiv.org/abs/1904.01538
Fence Removal
My camera can see through fences: A deep learning approach for image de-fencing
- intro: ACPR 2015
- arxiv: https://arxiv.org/abs/1805.07442
Deep learning based fence segmentation and removal from an image using a video sequence
- intro: ECCV Workshop on Video Segmentation, 2016
- arxiv: http://arxiv.org/abs/1609.07727
Accurate and efficient video de-fencing using convolutional neural networks and temporal information
https://arxiv.org/abs/1806.10781
Snow Removal
DesnowNet: Context-Aware Deep Network for Snow Removal
https://arxiv.org/abs/1708.04512
Blur Detection and Removal
Learning to Deblur
Learning a Convolutional Neural Network for Non-uniform Motion Blur Removal
End-to-End Learning for Image Burst Deblurring
Deep Video Deblurring
- intro: CVPR 2017 spotlight paper
- project page(code+dataset): http://www.cs.ubc.ca/labs/imager/tr/2017/DeepVideoDeblurring/
- arxiv: https://arxiv.org/abs/1611.08387
- github: https://github.com/shuochsu/DeepVideoDeblurring
Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring
- arxiv: https://arxiv.org/abs/1612.02177
- github(official, Torch): https://github.com/SeungjunNah/DeepDeblur_release
From Motion Blur to Motion Flow: a Deep Learning Solution for Removing Heterogeneous Motion Blur
Motion Deblurring in the Wild
Deep Face Deblurring
https://arxiv.org/abs/1704.08772
Learning Blind Motion Deblurring
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1708.04208
Deep Generative Filter for Motion Deblurring
https://arxiv.org/abs/1709.03481
DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
- intro: Ukrainian Catholic University & CTU in Prague
- arxiv: https://arxiv.org/abs/1711.07064
- github: https://github.com/KupynOrest/DeblurGAN
DeepDeblur: Fast one-step blurry face images restoration
https://arxiv.org/abs/1711.09515
Reblur2Deblur: Deblurring Videos via Self-Supervised Learning
- arxiv: https://arxiv.org/abs/1801.05117
- supplementary: https://drive.google.com/file/d/17Itta-z89lpWUdvUjpafKzJRSLoxHF5c/view
Scale-recurrent Network for Deep Image Deblurring
- intro: CUHK & Tencent & Megvii Inc.
- arxiv: https://arxiv.org/abs/1802.01770
Deep Semantic Face Deblurring
- intro: CVPR 2018. Beijing Institute of Technology & University of California, Merced & Nvidia Research
- project page: https://sites.google.com/site/ziyishenmi/cvpr18_face_deblur
- arxiv: https://arxiv.org/abs/1803.03345
Motion deblurring of faces
https://arxiv.org/abs/1803.03330
Learning a Discriminative Prior for Blind Image Deblurring
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1803.03363
Adversarial Spatio-Temporal Learning for Video Deblurring
https://arxiv.org/abs/1804.00533
Learning to Deblur Images with Exemplars
- intro: PAMI 2018
- arxiv: https://arxiv.org/abs/1805.05503
Down-Scaling with Learned Kernels in Multi-Scale Deep Neural Networks for Non-Uniform Single Image Deblurring
https://arxiv.org/abs/1903.10157
Image Compression
An image compression and encryption scheme based on deep learning
Full Resolution Image Compression with Recurrent Neural Networks
- arxiv: http://arxiv.org/abs/1608.05148
- github: https://github.com/tensorflow/models/tree/master/compression
Image Compression with Neural Networks
Lossy Image Compression With Compressive Autoencoders
- paper: http://openreview.net/pdf?id=rJiNwv9gg
- review: http://qz.com/835569/twitter-is-getting-close-to-making-all-your-pictures-just-a-little-bit-smaller/
End-to-end Optimized Image Compression
- arxiv: https://arxiv.org/abs/1611.01704
- notes: https://blog.acolyer.org/2017/05/08/end-to-end-optimized-image-compression/
CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression
Semantic Perceptual Image Compression using Deep Convolution Networks
- intro: Accepted to Data Compression Conference
- intro: Semantic JPEG image compression using deep convolutional neural network (CNN)
- arxiv: https://arxiv.org/abs/1612.08712
- github: https://github.com/iamaaditya/image-compression-cnn
Generative Compression
- intro: MIT
- arxiv: https://arxiv.org/abs/1703.01467
Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks
https://arxiv.org/abs/1703.10114
Learning Convolutional Networks for Content-weighted Image Compression
https://arxiv.org/abs/1703.10553
Real-Time Adaptive Image Compression
- intro: ICML 2017
- keywords: GAN
- project page: http://www.wave.one/icml2017
- arxiv: https://arxiv.org/abs/1705.05823
Learning to Inpaint for Image Compression
https://arxiv.org/abs/1709.08855
Efficient Trimmed Convolutional Arithmetic Encoding for Lossless Image Compression
https://arxiv.org/abs/1801.04662
Conditional Probability Models for Deep Image Compression
https://arxiv.org/abs/1801.04260
Multiple Description Convolutional Neural Networks for Image Compression
https://arxiv.org/abs/1801.06611
Near-lossless L-infinity constrained Multi-rate Image Decompression via Deep Neural Network
https://arxiv.org/abs/1801.07987
DeepSIC: Deep Semantic Image Compression
https://arxiv.org/abs/1801.09468
Spatially adaptive image compression using a tiled deep network
- intro: ICIP 2017
- arxiv: https://arxiv.org/abs/1802.02629
Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples
- intro: IJCAI 2018
- arxiv: https://arxiv.org/abs/1803.05787
DeepN-JPEG: A Deep Neural Network Favorable JPEG-based Image Compression Framework
- intro: DAC 2018
- arxiv: https://arxiv.org/abs/1803.05788
The Effects of JPEG and JPEG2000 Compression on Attacks using Adversarial Examples
https://arxiv.org/abs/1803.10418
Generative Adversarial Networks for Extreme Learned Image Compression
- intro: ETH Zurich
- homepage: https://data.vision.ee.ethz.ch/aeirikur/extremecompression/
- arxiv: https://arxiv.org/abs/1804.02958
Deformation Aware Image Compression
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1804.04593
Neural Multi-scale Image Compression
https://arxiv.org/abs/1805.06386
Deep Image Compression via End-to-End Learning
https://arxiv.org/abs/1806.01496
Image Quality Assessment
Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
Image Blending
GP-GAN: Towards Realistic High-Resolution Image Blending
- project page: https://wuhuikai.github.io/GP-GAN-Project/
- arxiv: https://arxiv.org/abs/1703.07195
- github(Official, Chainer): https://github.com/wuhuikai/GP-GAN
Image Enhancement
Deep Bilateral Learning for Real-Time Image Enhancement
- intro: MIT & Google Research
- arxiv: https://arxiv.org/abs/1707.02880
Aesthetic-Driven Image Enhancement by Adversarial Learning
- intro: CUHK
- arxiv: https://arxiv.org/abs/1707.05251
Learned Perceptual Image Enhancement
https://arxiv.org/abs/1712.02864
Deep Underwater Image Enhancement
https://arxiv.org/abs/1807.03528
Abnormality Detection / Anomaly Detection
Toward a Taxonomy and Computational Models of Abnormalities in Images
GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training
https://arxiv.org/abs/1805.06725
Depth Prediction / Depth Estimation
Deep Convolutional Neural Fields for Depth Estimation from a Single Image
- intro: CVPR 2015
- arxiv: https://arxiv.org/abs/1411.6387
Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields
- intro: IEEE T. Pattern Analysis and Machine Intelligence
- arxiv: https://arxiv.org/abs/1502.07411
- bitbucket: https://bitbucket.org/fayao/dcnf-fcsp
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue
- intro: ECCV 2016
- arxiv: https://arxiv.org/abs/1603.04992
- github: https://github.com/Ravi-Garg/Unsupervised_Depth_Estimation
Depth from a Single Image by Harmonizing Overcomplete Local Network Predictions
- intro: NIPS 2016
- project page: http://ttic.uchicago.edu/~ayanc/mdepth/
- arxiv: http://arxiv.org/abs/1605.07081
- github: https://github.com/ayanc/mdepth/
Deeper Depth Prediction with Fully Convolutional Residual Networks
Single image depth estimation by dilated deep residual convolutional neural network and soft-weight-sum inference
https://arxiv.org/abs/1705.00534
Monocular Depth Estimation with Hierarchical Fusion of Dilated CNNs and Soft-Weighted-Sum Inference
- intro: Northwestern Polytechnical University
- arxiv: https://arxiv.org/abs/1708.02287
Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image
- arxiv: https://arxiv.org/abs/1709.07492
- video: https://www.youtube.com/watch?v=vNIIT_M7x7Y
- github: https://github.com/fangchangma/sparse-to-dense
Size-to-depth: A New Perspective for Single Image Depth Estimation
https://arxiv.org/abs/1801.04461
Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1808.00150
Rethinking Monocular Depth Estimation with Adversarial Training
https://arxiv.org/abs/1808.07528
CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth
- intro: CVPR 2019
- project page: http://webdiis.unizar.es/~jmfacil/camconvs/
- arxiv: https://arxiv.org/abs/1904.02028
Texture Synthesis
Texture Synthesis Using Convolutional Neural Networks
Texture Networks: Feed-forward Synthesis of Textures and Stylized Images
- intro: ICML 2016
- arxiv: http://arxiv.org/abs/1603.03417
- github: https://github.com/DmitryUlyanov/texture_nets
- notes: https://blog.acolyer.org/2016/09/23/texture-networks-feed-forward-synthesis-of-textures-and-stylized-images/
Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
- arxiv: http://arxiv.org/abs/1604.04382
- github(Torch): https://github.com/chuanli11/MGANs
Texture Synthesis with Spatial Generative Adversarial Networks
Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis
- intro: Skolkovo Institute of Science and Technology & Yandex & University of Oxford
- arxiv: https://arxiv.org/abs/1701.02096
Deep TEN: Texture Encoding Network
- intro: CVPR 2017
- project page: http://zhanghang1989.github.io/DeepEncoding/
- arxiv: https://arxiv.org/abs/1612.02844
- github: https://github.com/zhanghang1989/Deep-Encoding
- notes: https://zhuanlan.zhihu.com/p/25013378
Diversified Texture Synthesis with Feed-forward Networks
- intro: CVPR 2017. University of California & Adobe Research
- arxiv: https://arxiv.org/abs/1703.01664
- github: https://github.com/Yijunmaverick/MultiTextureSynthesis
Image Cropping
Deep Cropping via Attention Box Prediction and Aesthetics Assessment
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1710.08014
A2-RL: Aesthetics Aware Reinforcement Learning for Automatic Image Cropping
- intro: CVPR 2018
- project page: http://debangli.info/A2RL/
- arxiv: https://arxiv.org/abs/1709.04595
- github(official): https://github.com/wuhuikai/TF-A2RL
- demo: http://wuhuikai.me/TF-A2RL/
Automatic Image Cropping for Visual Aesthetic Enhancement Using Deep Neural Networks and Cascaded Regression
- intro: IEEE Transactions on Multimedia, 2017
- arxiv: https://arxiv.org/abs/1712.09048
Grid Anchor based Image Cropping: A New Benchmark and An Efficient Model
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1909.08989
- github: https://github.com/HuiZeng/Grid-Anchor-based-Image-Cropping-Pytorch
Image Cropping with Composition and Saliency Aware Aesthetic Score Map
- intro: AAAI 2020
- arxiv: https://arxiv.org/abs/1911.10492
Image Synthesis
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
Generative Adversarial Text to Image Synthesis
- intro: ICML 2016
- arxiv: http://arxiv.org/abs/1605.05396
- github(Tensorflow): https://github.com/paarthneekhara/text-to-image
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
- intro: Rutgers University & Lehigh University & The Chinese University of Hong Kong & University of North Carolina at Charlotte
- arxiv: https://arxiv.org/abs/1612.03242
- github: https://github.com/hanzhanggit/StackGAN
- github: https://github.com/brangerbriz/docker-StackGAN
Semantic Image Synthesis via Adversarial Learning
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1707.06873
- github(PyTorch): https://github.com/woozzu/dong_iccv_2017
An Introduction to Image Synthesis with Generative Adversarial Nets
- intro: University of Illinois at Chicago & Toutiao AI Lab
- arxiv: https://arxiv.org/abs/1803.04469
Text Guided Person Image Synthesis
- intro: CVPR 2019
- intro: Zhejiang University & Nanjing University
- arxiv: https://arxiv.org/abs/1904.05118
Image Tagging
Fast Zero-Shot Image Tagging
Flexible Image Tagging with Fast0Tag
Sampled Image Tagging and Retrieval Methods on User Generated Content
- arxiv: https://arxiv.org/abs/1611.06962
- github: https://github.com/lab41/attalos
Kill Two Birds with One Stone: Weakly-Supervised Neural Network for Image Annotation and Tag Refinement
- intro: AAAI 2018
- arxiv: https://arxiv.org/abs/1711.06998
Deep Multiple Instance Learning for Zero-shot Image Tagging
https://arxiv.org/abs/1803.06051
Image Matching
Learning Fine-grained Image Similarity with Deep Ranking
- intro: CVPR 2014
- intro: Triplet Sampling
- arxiv: http://arxiv.org/abs/1404.4661
Learning to compare image patches via convolutional neural networks
- intro: CVPR 2015. siamese network
- project page: http://imagine.enpc.fr/~zagoruys/deepcompare.html
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zagoruyko_Learning_to_Compare_2015_CVPR_paper.pdf
- github: https://github.com/szagoruyko/cvpr15deepcompare
MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching
- intro: CVPR 2015. siamese network
- paper: http://www.cs.unc.edu/~xufeng/cs/papers/cvpr15-matchnet.pdf
- extended abstract: http://www.cv-foundation.org/openaccess/content_cvpr_2015/ext/2A_114_ext.pdf
- github: https://github.com/hanxf/matchnet
Fashion Style in 128 Floats
- intro: CVPR 2016. StyleNet
- project page: http://hi.cs.waseda.ac.jp/~esimo/en/research/stylenet/
- paper: http://hi.cs.waseda.ac.jp/~esimo/publications/SimoSerraCVPR2016.pdf
- github: https://github.com/bobbens/cvpr2016_stylenet
Fully-Trainable Deep Matching
- intro: BMVC 2016
- project page: http://lear.inrialpes.fr/src/deepmatching/
- arxiv: http://arxiv.org/abs/1609.03532
Local Similarity-Aware Deep Feature Embedding
- intro: NIPS 2016
- arxiv: https://arxiv.org/abs/1610.08904
Convolutional neural network architecture for geometric matching
- intro: CVPR 2017. Inria
- project page: http://www.di.ens.fr/willow/research/cnngeometric/
- arxiv: https://arxiv.org/abs/1703.05593
- github: https://github.com/ignacio-rocco/cnngeometric_matconvnet
Multi-Image Semantic Matching by Mining Consistent Features
https://arxiv.org/abs/1711.07641
Image Editing
Neural Photo Editing with Introspective Adversarial Networks
- intro: Heriot-Watt University
- arxiv: http://arxiv.org/abs/1609.07093
- github: https://github.com/ajbrock/Neural-Photo-Editor
Deep Feature Interpolation for Image Content Changes
- intro: CVPR 2017. Cornell University & Washington University
- arxiv: https://arxiv.org/abs/1611.05507
- github(official): https://github.com/paulu/deepfeatinterp
- github: https://github.com/slang03/dfi-tensorflow
Invertible Conditional GANs for image editing
- intro: NIPS 2016 Workshop on Adversarial Training
- arxiv: https://arxiv.org/abs/1611.06355
- github: https://github.com/Guim3/IcGAN
Semantic Facial Expression Editing using Autoencoded Flow
- intro: University of Illinois at Urbana-Champaign & The Chinese University of Hong Kong & Google
- arxiv: https://arxiv.org/abs/1611.09961
Language-Based Image Editing with Recurrent Attentive Models
https://arxiv.org/abs/1711.06288
Face Swap & Face Editing
Fast Face-swap Using Convolutional Neural Networks
- intro: Ghent University & Twitter
- arxiv: https://arxiv.org/abs/1611.09577
Neural Face Editing with Intrinsic Image Disentangling
- intro: CVPR 2017 oral
- project page: http://www3.cs.stonybrook.edu/~cvl/content/neuralface/neuralface.html
- arxiv: https://arxiv.org/abs/1704.04131
Arbitrary Facial Attribute Editing: Only Change What You Want
RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces
https://arxiv.org/abs/1804.03447
FaceShop: Deep Sketch-based Face Image Editing
https://arxiv.org/abs/1804.08972
Stereo
End-to-End Learning of Geometry and Context for Deep Stereo Regression
https://arxiv.org/abs/1703.04309
Unsupervised Adaptation for Deep Stereo
- intro: ICCV 2017
- paper: http://openaccess.thecvf.com/content_ICCV_2017/papers/Tonioni_Unsupervised_Adaptation_for_ICCV_2017_paper.pdf
- paper: http://vision.disi.unibo.it/~mpoggi/papers/iccv2017_adaptation.pdf
- github: https://github.com/CVLAB-Unibo/Unsupervised-Adaptation-for-Deep-Stereo
Cascade Residual Learning: A Two-stage Convolutional Neural Network for Stereo Matching
https://arxiv.org/abs/1708.09204
StereoConvNet: Stereo convolutional neural network for depth map prediction from stereo images
EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching
https://arxiv.org/abs/1803.05196
Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
- intro: CVPR 2018. SenseTime Research & Sun Yat-sen University
- arxiv: https://arxiv.org/abs/1803.06641
Pyramid Stereo Matching Network
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1803.08669
- github: https://github.com/JiaRenChang/PSMNet
Cascaded multi-scale and multi-dimension convolutional neural network for stereo matching
https://arxiv.org/abs/1803.09437
Left-Right Comparative Recurrent Model for Stereo Matching
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1804.00796
Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching
https://arxiv.org/abs/1806.01677
Open-World Stereo Video Matching with Deep RNN
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1808.03959
Real-time self-adaptive deep stereo
- arxiv: https://arxiv.org/abs/1810.05424
- github: https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo
Group-wise Correlation Stereo Network
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1903.04025
- github(official): https://github.com/xy-guo/GwcNet
Self-calibrating Deep Photometric Stereo Networks
- intro: CVPR 2019 oral
- intro: The University of Hong Kong & University of Oxford & Peking University & Peng Cheng Laboratory & Osaka University
- keywords: Learning Based Uncalibrated Photometric Stereo for Non-Lambertian Surface
- project page: http://gychen.org/SDPS-Net/
- arxiv: https://arxiv.org/abs/1903.07366
- github(official, PyTorch): https://github.com/guanyingc/SDPS-Net
Learning to Adapt for Stereo
- intro: CVPR 2019
- intro: University of Bologna & University of Oxford & Australian National University & FiveAI
- arxiv: https://arxiv.org/abs/1904.02957
- github: https://github.com/CVLAB-Unibo/Learning2AdaptForStereo
StereoDRNet: Dilated Residual Stereo Net
- intro: CVPR 2019
- intro: University of North Carolina at Chapel Hill & Facebook Reality Labs
- arxiv: https://arxiv.org/abs/1904.02251
GA-Net: Guided Aggregation Net for End-to-end Stereo Matching
- intro: CVPR 2019 oral
- intro: University of Oxford & Baidu Research
- arxiv: https://arxiv.org/abs/1904.06587
Multi-Scale Geometric Consistency Guided Multi-View Stereo
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1904.08103
Guided Stereo Matching
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1905.10107
OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1908.06257
DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1909.05845
Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective with Transformers
- intro: Johns Hopkins University
- keywords: STereo TRansformer (STTR)
- arxiv: https://arxiv.org/abs/2011.02910
- github: https://github.com/mli0603/stereo-transformer
EGFN: Efficient Geometry Feature Network for Fast Stereo 3D Object Detection
- intro: Tianjin University
- arxiv: https://arxiv.org/abs/2111.14055
3D
Learning Spatiotemporal Features with 3D Convolutional Networks
C3D: Generic Features for Video Analysis
- project page: http://vlg.cs.dartmouth.edu/c3d/
- arxiv: http://arxiv.org/abs/1412.0767
- slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w07-conv3d.pdf
- github: https://github.com/facebook/C3D
C3D Model for Keras trained over Sports 1M
Sports 1M C3D Network to Keras
Deep End2End Voxel2Voxel Prediction
Aligning 3D Models to RGB-D Images of Cluttered Scenes
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
- homepage: http://dss.cs.princeton.edu/
- arxiv: http://arxiv.org/abs/1511.02300
Multi-view 3D Models from Single Images with a Convolutional Network
RotationNet: Learning Object Classification Using Unsupervised Viewpoint Estimation
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
- paper: http://deepcontext.cs.princeton.edu/paper.pdf
- project page: http://deepcontext.cs.princeton.edu/
Volumetric and Multi-View CNNs for Object Classification on 3D Data
- homepage: http://graphics.stanford.edu/projects/3dcnn/
- arxiv: https://arxiv.org/abs/1604.03265
- github: https://github.com/charlesq34/3dcnn.torch
Deep3D: Automatic 2D-to-3D Video Conversion with CNNs
- project page: http://dmlc.ml/mxnet/2016/04/04/deep3d-automatic-2d-to-3d-conversion-with-CNN.html
- paper: http://homes.cs.washington.edu/~jxie/pdf/deep3d.pdf
- github: https://github.com/piiswrong/deep3d
Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks
3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction
Body Meshes as Points
- intro: CVPR 2021
- intro: National University of Singapore & ByteDance AI Lab & Yitu Technology
- arxiv: https://arxiv.org/abs/2105.02467
- github: https://github.com/jfzhang95/BMP
Deep Learning for Makeup
Makeup like a superstar: Deep Localized Makeup Transfer Network
- intro: IJCAI 2016
- arxiv: http://arxiv.org/abs/1604.07102
Makeup-Go: Blind Reversion of Portrait Edit
- intro: The Chinese University of Hong Kong & Tencent Youtu Lab
- paper: http://openaccess.thecvf.com/content_ICCV_2017/papers/Chen_Makeup-Go_Blind_Reversion_ICCV_2017_paper.pdf
- paper: http://open.youtu.qq.com/content/file/iccv17_makeupgo.pdf
Music Tagging
Automatic tagging using deep convolutional neural networks
- arxiv: https://arxiv.org/abs/1606.00298
- github: https://github.com/keunwoochoi/music-auto_tagging-keras
Music tagging and feature extraction with MusicTaggerCRNN
https://keras.io/applications/#music-tagging-and-feature-extraction-with-musictaggercrnn
Action Recognition
Single Image Action Recognition by Predicting Space-Time Saliency
https://arxiv.org/abs/1705.04641
Attentional Pooling for Action Recognition
- intro: NIPS 2017
- project page: https://rohitgirdhar.github.io/AttentionalPoolingAction/
- arxiv: https://arxiv.org/abs/1711.01467
- github: https://github.com/rohitgirdhar/AttentionalPoolingAction/
Memory Attention Networks for Skeleton-based Action Recognition
- intro: IJCAI 2018
- keywords: Temporal Attention Recalibration Module (TARM) and a Spatio-Temporal Convolution Module (STCM)
- arxiv: https://arxiv.org/abs/1804.08254
- github: https://github.com/memory-attention-networks
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
- intro: MIT-IBM Watson AI Lab & MIT
- arxiv: https://arxiv.org/abs/2010.11757
- github: https://github.com/IBM/action-recognition-pytorch
Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition
- intro: ECCV 2020
- paper: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123690528.pdf
- github: https://github.com/kchengiva/DecoupleGCN-DropGraph
Temporal-Relational CrossTransformers for Few-Shot Action Recognition
- intro: University of Bristol
- arxiv: https://arxiv.org/abs/2101.06184
- github: https://github.com/tobyperrett/trx
CTR Prediction
Deep CTR Prediction in Display Advertising
- intro: ACM Multimedia Conference 2016
- arxiv: https://arxiv.org/abs/1609.06018
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
- intro: Harbin Institute of Technology & Huawei
- arxiv: https://arxiv.org/abs/1703.04247
Deep Interest Network for Click-Through Rate Prediction
- intro: Alibaba Inc.
- arxiv: https://arxiv.org/abs/1706.06978
Image Matters: Jointly Train Advertising CTR Model with Image Representation of Ad and User Behavior
- intro: Alibaba Inc.
- arxiv: https://arxiv.org/abs/1711.06505
Cryptography
Learning to Protect Communications with Adversarial Neural Cryptography
- intro: Google Brain
- arxiv: https://arxiv.org/abs/1610.06918
- github(Theano): https://github.com/nlml/adversarial-neural-crypt
- github(TensorFlow): https://github.com/ankeshanand/neural-cryptography-tensorflow
Adversarial Neural Cryptography in Theano
Embedding Watermarks into Deep Neural Networks
Digital Watermarking for Deep Neural Networks
- intro: International Journal of Multimedia Information Retrieval
- arxiv: https://arxiv.org/abs/1802.02601
Cyber Security
Collection of Deep Learning Cyber Security Research Papers
Lip Reading
LipNet: Sentence-level Lipreading
LipNet: End-to-End Sentence-level Lipreading
- arxiv: https://arxiv.org/abs/1611.01599
- paper: http://openreview.net/pdf?id=BkjLkSqxg
- github: https://github.com/bshillingford/LipNet
Lip Reading Sentences in the Wild
- intro: University of Oxford & Google DeepMind
- arxiv: https://arxiv.org/abs/1611.05358
- youtube: https://www.youtube.com/watch?v=5aogzAUPilE
Combining Residual Networks with LSTMs for Lipreading
End-to-End Multi-View Lipreading
- intro: BMVC 2017
- arxiv: https://arxiv.org/abs/1709.00443
LCANet: End-to-End Lipreading with Cascaded Attention-CTC
- intro: FG 2018
- arxiv: https://arxiv.org/abs/1803.04988
Event Recognition
Better Exploiting OS-CNNs for Better Event Recognition in Images
Transferring Object-Scene Convolutional Neural Networks for Event Recognition in Still Images
IOD-CNN: Integrating Object Detection Networks for Event Recognition
https://arxiv.org/abs/1703.07431
Trajectory Prediction
Trajformer: Trajectory Prediction with Local Self-Attentive Contexts for Autonomous Driving
- intro: Machine Learning for Autonomous Driving @ NeurIPS 2020
- intro: Carnegie Mellon University & Bosch Research Pittsburgh
- arxiv: https://arxiv.org/abs/2011.14910
Human-Object Interaction
Learning Human-Object Interactions by Graph Parsing Neural Networks
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1808.07962
- github: https://github.com/SiyuanQi/gpnn
Interact as You Intend: Intention-Driven Human-Object Interaction Detection
https://arxiv.org/abs/1808.09796
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
- intro: BMVC 2018
- project page: https://gaochen315.github.io/iCAN/
- arxiv: https://arxiv.org/abs/1808.10437
- github: https://github.com/vt-vl-lab/iCAN
Pose-aware Multi-level Feature Network for Human Object Interaction Detection
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1909.08453
End-to-End Human Object Interaction Detection with HOI Transformer
- intro: CVPR 2021
- intro: MEGVII Technology
- arxiv: https://arxiv.org/abs/2103.04503
- github: https://github.com/bbepoch/HoiTransformer
Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction
- intro: ICCV 2021
- intro: OPPO Research Institute & Xmotors & University of California
- arxiv: https://arxiv.org/abs/2110.03278
Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction
- intro: BMVC 2021
- arxiv: https://arxiv.org/abs/2110.10174
- github: https://github.com/takumayagi/hand_object_contact_prediction
Deep Learning in Finance
Deep Learning in Finance
A Survey of Deep Learning Techniques Applied to Trading
Deep Learning and Long-Term Investing
- part 1: http://www.euclidean.com/deep-learning-long-term-investing-1
- part 2: http://www.euclidean.com/deep-learning-investing-part-2-preprocessing-data
Deep Learning in Trading
Research to Products: Machine & Human Intelligence in Finance
- intro: Peter Sarlin, Hanken School of Economics - Deep Learning in Finance Summit 2016 #reworkfin
- youtube: https://www.youtube.com/watch?v=Fd7Cc-KOVXg
- mirror: https://pan.baidu.com/s/1kVpZKur#list/path=%2F
Deep Neural Networks for Real-time Market Predictions
Deep Learning the Stock Market
- blog: https://medium.com/@TalPerry/deep-learning-the-stock-market-df853d139e02#.z752rf43u
- github: https://github.com/talolard/MarketVectors
rl_portfolio
- intro: This Repository uses Reinforcement Learning and Supervised learning to Optimize portfolio allocation.
- github: https://github.com/deependersingla/deep_portfolio
Neural networks for algorithmic trading. Multivariate time series
- blog: https://medium.com/@alexrachnog/neural-networks-for-algorithmic-trading-2-1-multivariate-time-series-ab016ce70f57
- github: https://github.com/Rachnog/Deep-Trading/tree/master/multivariate
Deep-Trading: Algorithmic trading with deep learning experiments
https://github.com/Rachnog/Deep-Trading
Neural networks for algorithmic trading. Multimodal and multitask deep learning
- blog: https://becominghuman.ai/neural-networks-for-algorithmic-trading-multimodal-and-multitask-deep-learning-5498e0098caf
- github: https://github.com/Rachnog/Deep-Trading/tree/master/multimodal
Deep Learning with Python in Finance - Singapore Python User Group
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem
- intro: Xi’an Jiaotong-Liverpool University
- keywords: PGPortfolio: Policy Gradient Portfolio
- arxiv: https://arxiv.org/abs/1706.10059
- github: https://github.com/ZhengyaoJiang/PGPortfolio
Stock Prediction: a method based on extraction of news features and recurrent neural networks
- intro: Peking University. The 22nd China Conference on Information Retrieval
- arxiv: https://arxiv.org/abs/1707.07585
Multidimensional LSTM Networks to Predict Bitcoin Price
- blog: http://www.jakob-aungiers.com/articles/a/Multidimensional-LSTM-Networks-to-Predict-Bitcoin-Price
- github: https://github.com/jaungiers/Multidimensional-LSTM-BitCoin-Time-Series
Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals
- intro: Euclidean Technologies & Amazon AI
- arxiv: https://arxiv.org/abs/1711.04837
Findings from our Research on Applying Deep Learning to Long-Term Investing
http://www.euclidean.com/paper-on-deep-learning-long-term-investing
Predicting Cryptocurrency Prices With Deep Learning
- intro: This post brings together cryptos and deep learning in a desperate attempt for Reddit popularity
- blog: https://dashee87.github.io/deep%20learning/python/predicting-cryptocurrency-prices-with-deep-learning/
Deep Trading Agent
- intro: Deep Reinforcement Learning based Trading Agent for Bitcoin
- github: https://github.com/samre12/deep-trading-agent
Financial Trading as a Game: A Deep Reinforcement Learning Approach
- intro: National Chiao Tung University
- arxiv: https://arxiv.org/abs/1807.02787
Deep Learning in Speech
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
- intro: Baidu Research, ICML 2016
- arxiv: https://arxiv.org/abs/1512.02595
- github(Neon): https://github.com/NervanaSystems/deepspeech
End-to-end speech recognition with neon
WaveNet
WaveNet: A Generative Model for Raw Audio
- homepage: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
- paper: https://drive.google.com/file/d/0B3cxcnOkPx9AeWpLVXhkTDJINDQ/view
- mirror: https://pan.baidu.com/s/1gfmGWaJ
- github: https://github.com/usernaamee/keras-wavenet
- github: https://github.com/ibab/tensorflow-wavenet
- github: https://github.com/monthly-hack/chainer-wavenet
- github: https://github.com/huyouare/WaveNet-Theano
- github(Keras): https://github.com/basveeling/wavenet
- github: https://github.com/ritheshkumar95/WaveNet
A TensorFlow implementation of DeepMind’s WaveNet paper for text generation.
Fast Wavenet Generation Algorithm
- intro: An efficient Wavenet generation implementation
- arxiv: https://arxiv.org/abs/1611.09482
- github: https://github.com/tomlepaine/fast-wavenet
Speech-to-Text-WaveNet: End-to-end sentence level English speech recognition based on DeepMind’s WaveNet and TensorFlow
Wav2Letter: an End-to-End ConvNet-based Speech Recognition System
TristouNet: Triplet Loss for Speaker Turn Embedding
Speech Recognition and Deep Learning
- intro: Baidu Research Silicon Valley AI Lab
- slides: http://cs.stanford.edu/~acoates/ba_dls_speech2016.pdf
- mirror: https://pan.baidu.com/s/1qYrPkPQ
- github: https://github.com/baidu-research/ba-dls-deepspeech
Robust end-to-end deep audiovisual speech recognition
- intro: CMU
- arxiv: https://arxiv.org/abs/1611.06986
An Experimental Comparison of Deep Neural Networks for End-to-end Speech Recognition
Recurrent Deep Stacking Networks for Speech Recognition
- intro: The Ohio State University
- arxiv: https://arxiv.org/abs/1612.04675
Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks
- intro: Université de Montréal & CIFAR
- arxiv: https://arxiv.org/abs/1701.02720
Deep Learning for Sound / Music
Sound
Suggesting Sounds for Images from Video Collections
- intro: ETH Zurich & Disney Research
- paper: https://s3-us-west-1.amazonaws.com/disneyresearch/wp-content/uploads/20161014182443/Suggesting-Sounds-for-Images-from-Video-Collections-Paper.pdf
Disney AI System Associates Images with Sounds
Convolutional Recurrent Neural Networks for Bird Audio Detection
- arxiv: https://arxiv.org/abs/1703.02317
Visual to Sound: Generating Natural Sound for Videos in the Wild
- project page: http://bvision11.cs.unc.edu/bigpen/yipin/visual2sound_webpage/visual2sound.html
- arxiv: https://arxiv.org/abs/1712.01393
Music
Learning Features of Music from Scratch
- intro: University of Washington. MusicNet
- project page: http://homes.cs.washington.edu/~thickstn/musicnet.html
- arxiv: https://arxiv.org/abs/1611.09827
- demo: http://homes.cs.washington.edu/~thickstn/demos.html
DeepBach: a Steerable Model for Bach chorales generation
- project page: http://www.flow-machines.com/deepbach-steerable-model-bach-chorales-generation/
- arxiv: https://arxiv.org/abs/1612.01010
- github: https://github.com/SonyCSL-Paris/DeepBach
- youtube: https://www.youtube.com/watch?v=QiBM7-5hA6o
Deep Learning for Music
First International Workshop on Deep Learning and Music
- arxiv: https://arxiv.org/html/1706.08675
Deep Learning in Medicine and Biology
Low Data Drug Discovery with One-shot Learning
- intro: MIT & Stanford University
- arxiv: https://arxiv.org/abs/1611.03199
- homepage: http://deepchem.io/
- github: https://github.com/deepchem/deepchem
Democratizing Drug Discovery with DeepChem
Introduction to Deep Learning in Medicine and Biology
Deep Learning for Alzheimer Diagnostics and Decision Support
https://amundtveit.com/2016/11/18/deep-learning-for-alzheimer-diagnostics-and-decision-support/
DeepCancer: Detecting Cancer through Gene Expressions via Deep Generative Learning
- intro: University of Florida
- arxiv: https://arxiv.org/abs/1612.03211
Towards biologically plausible deep learning
- intro: Yoshua Bengio, NIPS’2016 Workshops
- slides: http://www.iro.umontreal.ca/~bengioy/talks/Brains+Bits-NIPS2016Workshop.pptx.pdf
Deep Learning and Its Applications to Machine Health Monitoring: A Survey
Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
Deep Learning Applications in Medical Imaging
Dermatologist-level classification of skin cancer with deep neural networks
- intro: Stanford University. Nature 2017
- paper: http://www.nature.com/nature/journal/vaop/ncurrent/pdf/nature21056.pdf
Deep Learning for Health Informatics
- intro: Imperial College London
- paper: http://ieeexplore.ieee.org/abstract/document/7801947/
Deep Learning for Fashion
Convolutional Neural Networks for Fashion Classification and Object Detection
- intro: CS231N project
- paper: http://cs231n.stanford.edu/reports/BLAO_KJAG_CS231N_FinalPaperFashionClassification.pdf
DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations
- intro: CVPR 2016
- project page: http://personal.ie.cuhk.edu.hk/~lz013/projects/DeepFashion.html
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DeepFashion_Powering_Robust_CVPR_2016_paper.pdf
Deep Learning for Fast and Accurate Fashion Item Detection
- keywords: MultiBox and Fast R-CNN, Kuznech-Fashion-156 and Kuznech-Fashion-205 fashion item detection datasets
- paper: https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf
Deep Learning at GILT
- keywords: automated tagging, automatic dress faceting
- blog: http://tech.gilt.com/machine/learning,/deep/learning/2016/12/22/deep-learning-at-gilt
Working with Fashion Models
- blog: https://making.lyst.com/2017/02/21/working-with-fashion-models/
- youtube: https://www.youtube.com/watch?v=emr2qaCQOQs
Fashion Forward: Forecasting Visual Style in Fashion
- intro: Karlsruhe Institute of Technology & The University of Texas at Austin
- arxiv: https://arxiv.org/abs/1705.06394
StreetStyle: Exploring world-wide clothing styles from millions of photos
- homepage: http://streetstyle.cs.cornell.edu/
- arxiv: https://arxiv.org/abs/1706.01869
- demo: http://streetstyle.cs.cornell.edu/trends.html
Fashioning with Networks: Neural Style Transfer to Design Clothes
- intro: ML4Fashion 2017
- arxiv: https://arxiv.org/abs/1707.09899
Deep Learning Our Way Through Fashion Week
https://inside.edited.com/deep-learning-our-way-through-fashion-week-ea55bf50bab8
Be Your Own Prada: Fashion Synthesis with Structural Coherence
- intro: ICCV 2017
- paper: http://openaccess.thecvf.com/content_ICCV_2017/papers/Zhu_Be_Your_Own_ICCV_2017_paper.pdf
- github: https://github.com/zhusz/ICCV17-fashionGAN
Others
Selfai: Predicting Facial Beauty in Selfies
Selfai: A Method for Understanding Beauty in Selfies
- blog: http://www.erogol.com/selfai-predicting-facial-beauty-selfies/
- github: https://github.com/erogol/beauty.torch
Deep Learning Enables You to Hide Screen when Your Boss is Approaching
- blog: http://ahogrammer.com/2016/11/15/deep-learning-enables-you-to-hide-screen-when-your-boss-is-approaching/
- github: https://github.com/Hironsan/BossSensor
Blogs
40 Ways Deep Learning is Eating the World
Applications
http://www.deeplearningpatterns.com/doku.php/applications
Systematic Approach To Applications Of Deep Learning
https://gettocode.com/2016/11/25/systematic-approach-to-applications-of-deep-learning/
Resources
Deep Learning Gallery - a curated collection of deep learning projects
http://deeplearninggallery.com/