Deep Learning Software and Hardware

Papers

Accelerating Deep Convolutional Neural Networks Using Specialized Hardware

Installation / Deploying

Setting up a Deep Learning Machine from Scratch (Software): Instructions for setting up the software on your deep learning machine

  • intro: A detailed guide to setting up your machine for deep learning research. Includes instructions to install drivers, tools and various deep learning frameworks. This was tested on a 64-bit machine with an Nvidia Titan X, running Ubuntu 14.04.
  • github: https://github.com/saiprashanths/dl-setup

How to install CUDA Toolkit and cuDNN for deep learning

Deploying Deep Learning: Guide to deploying deep-learning inference networks and realtime object detection with TensorRT and Jetson TX1.

Install Log

Lessons Learned from Deploying Deep Learning at Scale

Docker

All-in-one Docker image for Deep Learning

NVIDIA Docker: GPU Server Application Deployment Made Easy

Deep learning base image for Docker (Tensorflow, Caffe, MXNet, Torch, Openface, etc.)

https://github.com/dominiek/deep-base

Deepo: a Docker image with a full reproducible deep learning research environment

Cloud

SuperVessel Cloud for POWER/OpenPOWER

http://www.ptopenlab.com/

Building Deep Neural Networks in the Cloud with Azure GPU VMs, MXNet and Microsoft R Server

https://blogs.technet.microsoft.com/machinelearning/2016/09/15/building-deep-neural-networks-in-the-cloud-with-azure-gpu-vms-mxnet-and-microsoft-r-server/

Microsoft open sources its next-gen cloud hardware design

Google Taps AMD For Accelerating Machine Learning In The Cloud

http://www.forbes.com/sites/aarontilley/2016/11/15/google-taps-amd-for-accelerating-machine-learning-in-the-cloud/#3549d8554181

Amazon EC2

Deep Learning AMI on AWS Marketplace

https://aws.amazon.com/marketplace/pp/B01M0AXXQB

We Have To Go Deeper: AWS p2.xlarge GPU optimized deep learning cluster-grenade

A GPU enabled AMI for Deep Learning

Keras with GPU on Amazon EC2 – a step-by-step instruction

https://medium.com/@mateuszsieniawski/keras-with-gpu-on-amazon-ec2-a-step-by-step-instruction-4f90364e49ac#.k27d0mqir

Microsoft R Server

Training Deep Neural Networks on ImageNet Using Microsoft R Server and Azure GPU VMs

Hardware System

I: Building a Deep Learning (Dream) Machine

II: Running a Deep Learning (Dream) Machine

A Full Hardware Guide to Deep Learning

Build your own Deep Learning Box

32-TFLOP Deep Learning GPU Box: A super-fast linux-based machine with multiple GPUs for training deep neural nets

https://hackaday.io/project/12070-32-tflop-deep-learning-gpu-box

Hands-on with the NVIDIA DIGITS DevBox for Deep Learning

Considerations when setting up deep learning hardware

Building a Workstation for Deep Learning

Deep Learning Machine: First build experience

Building a machine learning/deep learning workstation for under $5000

Hardware Guide: Neural Networks on GPUs (Updated 2016-1-30)

Building Your Own Deep Learning Box

https://medium.com/@bfortuner/building-your-own-deep-learning-box-47b918aea1eb#.4r5zchk4f

Setting up a Deep learning machine in a lazy yet quick way

https://medium.com/@sravsatuluri/setting-up-a-deep-learning-machine-in-a-lazy-yet-quick-way-be2642318850#.jrxrkfxa2

Deep Confusion: Misadventures In Building A Deep Learning Machine

http://www.topbots.com/deep-confusion-misadventures-in-building-a-machine-learning-server/

DIY-Deep-Learning-Workstation

GPU

Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning

GPU Hardware Architecture, Viewed Through the Choice of a GPU for Deep Learning (Chinese)

GPU Tinkering Notes, 2015 (by Mu Li (李沐)) (Chinese)

HPC, Deep Learning and GPUs (2016 Stanford HPC Conference)

Modern GPU 2.0: Design patterns for GPU computing

CuMF: CUDA-Accelerated ALS on multiple GPUs.

Basic Performance Analysis of NVIDIA GPU Accelerator Cards for Deep Learning Applications

CuPy: NumPy-like API accelerated with CUDA

NumPy GPU acceleration

Efficient Convolutional Neural Network Inference on Mobile GPUs (Embedded Vision Summit)

Deep Learning with Multiple GPUs on Rescale: Torch

GPU-accelerated Theano & Keras on Windows 10 native

NVIDIA Announces Quadro GP100 - Big Pascal Comes to Workstations

http://www.anandtech.com/show/11102/nvidia-announces-quadro-gp100

FPGA

Recurrent Neural Networks Hardware Implementation on FPGA

Is implementing deep learning on FPGAs a natural next step after the success with GPUs?

Efficient Implementation of Neural Network Systems Built on FPGAs, Programmed with OpenCL

Deep Learning on FPGAs: Past, Present, and Future

FPGAs Challenge GPUs as a Platform for Deep Learning

Convolution Neural Network CNN Implementation on Altera FPGA using OpenCL

Accelerating Deep Learning Using Altera FPGAs (Embedded Vision Summit)

Machine Learning on FPGAs: Neural Networks

Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs

Microsoft Goes All in for FPGAs to Build Out AI Cloud

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Intel Unveils FPGA to Accelerate Neural Networks

http://datacenterfrontier.com/intel-unveils-fpga-to-accelerate-ai-neural-networks/

Deep Learning with FPGA

A General Neural Network Hardware Architecture on FPGA

Approximate FPGA-based LSTMs under Computation Time Constraints

ARM / Processor

‘Neural network’ spotted deep inside Samsung’s Galaxy S7 silicon brain: Secrets of Exynos M1 cores spilled

Intel will add deep-learning instructions to its processors

SRAM

ShiDianNao: Shifting Vision Processing Closer to the Sensor

http://lap.epfl.ch/files/content/sites/lap/files/shared/publications/DuJun15_ShiDianNaoShiftingVisionProcessingCloserToTheSensor_ISCA15.pdf

Blogs

Emerging “Universal” FPGA, GPU Platform for Deep Learning

An Early Look at Startup Graphcore’s Deep Learning Chip

https://www.nextplatform.com/2017/03/09/early-look-startup-graphcores-deep-learning-chip/

Hardware for Deep Learning

https://medium.com/towards-data-science/hardware-for-deep-learning-8d9b03df41a

Videos

Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural Networks

Published: 09 Oct 2015

Deep Learning Resources

ImageNet

Deep Learning Frameworks

Amazon DSSTNE

Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine

Apache SINGA

Blocks

Blocks: A Theano framework for building and training neural networks

Blocks and Fuel: Frameworks for deep learning

BrainCore

BrainCore: The iOS and OS X neural network framework

https://github.com/aleph7/BrainCore

Brainstorm

Brainstorm: Fast, flexible and fun neural networks

Caffe

Caffe: Convolutional Architecture for Fast Feature Embedding

OpenCL Caffe

Caffe on both Linux and Windows

ApolloCaffe: a fork of Caffe that supports dynamic networks

fb-caffe-exts: Some handy utility libraries and tools for the Caffe deep learning framework

Caffe-Android-Lib: Porting caffe to android platform

caffe-android-demo: An android caffe demo app exploiting caffe pre-trained ImageNet model for image classification

Caffe.js: Run Caffe models in the browser using ConvNetJS

Intel Caffe

  • intro: This fork of BVLC/Caffe is dedicated to improving performance of this deep learning framework when running on CPU, in particular Intel® Xeon processors (HSW+) and Intel® Xeon Phi processors
  • github: https://github.com/intel/caffe

NVIDIA Caffe

https://github.com/NVIDIA/caffe

Mini-Caffe

Caffe on Mobile Devices

CaffeOnACL

  • intro: Uses the ARM Compute Library (NEON+GPU) to speed up Caffe; provides utilities to debug, profile and tune application performance
  • github: https://github.com/OAID/caffeOnACL

Multi-GPU / MPI Caffe

Caffe with OpenMPI-based Multi-GPU support

mpi-caffe: Model-distributed Deep Learning with Caffe and MPI

Caffe-MPI for Deep Learning

Caffe Utils

Caffe-model

Caffe2

Caffe2: A New Lightweight, Modular, and Scalable Deep Learning Framework

CDNN2

CDNN2 - CEVA Deep Neural Network Software Framework

Chainer

Chainer: a neural network framework

Introduction to Chainer: Neural Networks in Python

CNTK

CNTK: Computational Network Toolkit

An Introduction to Computational Networks and the Computational Network Toolkit

http://research.microsoft.com/apps/pubs/?id=226641

ConvNetJS

ConvNetJS: Deep Learning in Javascript. Train Convolutional Neural Networks (or ordinary ones) in your browser

DeepBeliefSDK

DeepBeliefSDK: The SDK for Jetpac’s iOS, Android, Linux, and OS X Deep Belief image recognition framework

DeepDetect

DeepDetect: Open Source API & Deep Learning Server

Deeplearning4j (DL4J)

Deeplearning4j: Deep Learning for Java

Deeplearning4j images for CUDA and Hadoop.

Deeplearning4J Examples

DeepLearningKit

DeepLearningKit: Open Source Deep Learning Framework for Apple’s tvOS, iOS and OS X

Tutorial — Using DeepLearningKit with iOS for iPhone and iPad

https://medium.com/@atveit/tutorial-using-deeplearningkit-with-ios-for-iphone-and-ipad-de727679bae4#.1bvnhxhjo

DeepSpark

DeepSpark: Deeplearning framework running on Spark

DIGITS

DIGITS: the Deep Learning GPU Training System

dp

dp: A deep learning library for streamlining research and development using the Torch7 distribution

Dragon

Dragon: A Computation Graph Virtual Machine Based Deep Learning Framework

DyNet

DyNet: The Dynamic Neural Network Toolkit

DyNet Benchmarks

IDLF

IDLF: The Intel® Deep Learning Framework

Keras

Keras: Deep Learning library for Theano and TensorFlow

MarcBS/keras fork

Hera: Train/evaluate a Keras model, get metrics streamed to a dashboard in your browser.

Installing Keras for deep learning

Keras Applications - deep learning models that are made available alongside pre-trained weights

https://keras.io/applications/

Keras resources: Directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library

Keras.js: Run trained Keras models in the browser, with GPU support

keras2cpp

keras-cn: Chinese keras documents with more examples, explanations and tips.

Kerasify: Small library for running Keras models from a C++ application

https://github.com/moof2k/kerasify

Knet

Knet: Koç University deep learning framework

Lasagne

Lasagne: Lightweight library to build and train neural networks in Theano

Leaf

Leaf: The Hacker’s Machine Learning Engine

LightNet

LightNet: A Versatile, Standalone and Matlab-based Environment for Deep Learning

MatConvNet

MatConvNet: CNNs for MATLAB

Marvin

Marvin: A minimalist GPU-only N-dimensional ConvNet framework

Mocha.jl

Mocha.jl: Deep Learning for Julia

MXNet

MXNet

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

MXNet Model Gallery: Pre-trained Models of DMLC Project

A short introduction to MXNet design and implementation (Chinese)

Deep learning for hackers with MXNet (1): GPU installation and MNIST

https://no2147483647.wordpress.com/2015/12/07/deep-learning-for-hackers-with-mxnet-1/

MXNet: Efficient, Flexible Deep Learning Framework

Use Caffe operator in MXNet

Deep Learning in a Single File for Smart Devices

https://mxnet.readthedocs.org/en/latest/tutorial/smart_device.html

MXNet Pascal Titan X benchmark

Hands-on Deep Learning with MXNet (1): Installing the GPU Version of MXNet and Running MNIST Handwritten Digit Recognition (Chinese)

http://phunter.farbox.com/post/mxnet-tutorial1

Hands-on Deep Learning with MXNet (2): Neural Art (Chinese)

http://phunter.farbox.com/post/mxnet-tutorial2

Programming Models and Systems Design for Deep Learning

Awesome MXNet

Getting Started with MXNet

https://indico.io/blog/getting-started-with-mxnet/

gtc_tutorial: MXNet Tutorial for NVidia GTC 2016

MXNET Dependency Engine

This Is How MXNet Squeezes Deep Learning Memory Consumption (Chinese)

WhatsThis-iOS: MXNet WhatThis Example for iOS

MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning

ncnn

neocortex.js

Run trained deep neural networks in the browser or node.js

Neon

Neon: Nervana’s Python-based deep learning library

Tools to convert Caffe models to neon’s serialization format

Nervana’s Deep Learning Course

NNabla

NNabla - Neural Network Libraries by Sony

  • intro: NNabla (Neural Network Libraries) is a deep learning framework intended for research, development and production. It aims to run everywhere: desktop PCs, HPC clusters, embedded devices and production servers.
  • homepage: https://nnabla.org/
  • github: https://github.com/sony/nnabla

OpenDeep

OpenDeep: a fully modular & extensible deep learning framework in Python

OpenNN

OpenNN - Open Neural Networks Library

Paddle

PaddlePaddle: PArallel Distributed Deep LEarning

A Heterogeneous Distributed Deep Learning Platform Based on Spark (Chinese)

http://geek.csdn.net/news/detail/58867

Petuum

Petuum: a distributed machine learning framework

PlaidML

PlaidML: A framework for making deep learning work everywhere

Platoon

Platoon: Multi-GPU mini-framework for Theano

Poseidon

Poseidon: Distributed Deep Learning Framework on Petuum

Purine

Purine: A bi-graph based deep learning framework

PyTorch

PyTorch

Datasets, Transforms and Models specific to Computer Vision

https://github.com/pytorch/vision/

Convert torch to pytorch

https://github.com/clcarwin/convert_torch_to_pytorch

TensorFlow

TensorFlow

Benchmarks

TensorDebugger (TDB)

TensorDebugger (TDB): Interactive, node-by-node debugging and visualization for TensorFlow

ofxMSATensorFlow: OpenFrameworks addon for Google’s data-flow graph based numerical computation / machine intelligence library TensorFlow.

TFLearn: Deep learning library featuring a higher-level API for TensorFlow

TensorFlow on Spark

TensorBoard

TensorFlow.jl: A Julia wrapper for the TensorFlow Python library

TensorLayer: Deep learning and Reinforcement learning library for TensorFlow

OpenCL support for TensorFlow

Pretty Tensor: Fluent Networks in TensorFlow

Rust language bindings for TensorFlow

TensorFlow Ecosystem: Integration of TensorFlow with other open-source frameworks

Caffe to TensorFlow

TensorFlow Mobile

https://www.tensorflow.org/mobile/

Papers

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

TensorFlow: A system for large-scale machine learning

TensorFlow Distributions

https://arxiv.org/abs/1711.10604

Tutorials

TensorFlow Official Documentation, Chinese Edition

Theano

Theano

Theano-Tutorials: Bare bones introduction to machine learning from linear regression to convolutional neural networks using Theano

Theano: A Python framework for fast computation of mathematical expressions

Configuring Theano For High Performance Deep Learning

http://www.johnwittenauer.net/configuring-theano-for-high-performance-deep-learning/

Theano: a short practical guide

Ian Goodfellow’s Tutorials on Theano

Plato: A library built on top of Theano

Theano Windows Install Guide

Theano-MPI: a Theano-based Distributed Training Framework

tiny-dnn (tiny-cnn)

tiny-dnn: A header only, dependency-free deep learning framework in C++11

Deep learning with C++ - an introduction to tiny-dnn

Torch

Torch

loadcaffe: Load Caffe networks in Torch7

Applied Deep Learning for Computer Vision with Torch

pytorch: Python wrappers for torch and lua

Torch Toolbox: A collection of snippets and libraries for Torch

cltorch: a Hardware-Agnostic Backend for the Torch Deep Neural Network Library, Based on OpenCL

Torchnet: An Open-Source Platform for (Deep) Learning Research

THFFmpeg: Torch bindings for FFmpeg (reading videos only)

caffegraph: Load Caffe networks in Torch7 using nngraph

Optimized-Torch: Intel Torch is dedicated to improving Torch performance when running on CPU

Torch Video Tutorials

Torch in Action

VELES

VELES: Distributed platform for rapid Deep learning application development

WebDNN

WebDNN: Fastest DNN Execution Framework on Web Browser

Yann

Yann: Yet Another Neural Network Toolbox

Benchmarks

Easy benchmarking of all publicly accessible implementations of convnets

https://github.com/soumith/convnet-benchmarks

Stanford DAWN Deep Learning Benchmark (DAWNBench) - An End-to-End Deep Learning Benchmark and Competition

http://dawn.cs.stanford.edu/benchmark/index.html

Tutorials

Deep Learning Implementations and Frameworks (DLIF)

Papers

Comparative Study of Deep Learning Software Frameworks

Benchmarking State-of-the-Art Deep Learning Software Tools

Projects

TensorFuse: Common interface for Theano, CGT, and TensorFlow

DeepRosetta: An universal deep learning models conversor

Deep Learning Model Convertors

https://github.com/ysh329/deep-learning-model-convertor

References

Frameworks and Libraries for Deep Learning

http://creative-punch.net/2015/07/frameworks-and-libraries-for-deep-learning/

TensorFlow vs. Theano vs. Torch

https://github.com/zer0n/deepframeworks/blob/master/README.md

Evaluation of Deep Learning Toolkits

https://github.com/zer0n/deepframeworks/blob/master/README.md

Deep Machine Learning libraries and frameworks

https://medium.com/@abduljaleel/deep-machine-learning-libraries-and-frameworks-5fdf2bb6bfbe#.q1mhj7c36

Torch vs Theano

Deep Learning Software: NVIDIA Deep Learning SDK

https://developer.nvidia.com/deep-learning-software

A comparison of deep learning frameworks

TensorFlow Meets Microsoft’s CNTK

Is there a case for still using Torch, Theano, Brainstorm, MXNET and not switching to TensorFlow?

  • reddit: https://www.reddit.com/r/MachineLearning/comments/47qh90/is_there_a_case_for_still_using_torch_theano/

DL4J vs. Torch vs. Theano vs. Caffe vs. TensorFlow

http://deeplearning4j.org/compare-dl4j-torch7-pylearn.html

Popular Deep Learning Libraries

The simple example of Theano and Lasagne super power

https://grzegorzgwardys.wordpress.com/2016/05/15/the-simple-example-of-theano-and-lasagne-super-power/

Comparison of deep learning software

A Look at Popular Machine Learning Frameworks

5 Deep Learning Projects You Can No Longer Overlook

Comparison of Deep Learning Libraries After Years of Use

Deep Learning Part 1: Comparison of Symbolic Deep Learning Frameworks

Deep Learning Frameworks Compared

Deep Learning frameworks: a review before finishing 2016

https://medium.com/@ricardo.guerrero/deep-learning-frameworks-a-review-before-finishing-2016-5b3ab4010b06#.a6fdrqssl

The Anatomy of Deep Learning Frameworks

https://medium.com/@gokul_uf/the-anatomy-of-deep-learning-frameworks-46e2a7af5e47

Python Deep Learning Frameworks Reviewed

https://indico.io/blog/python-deep-learning-frameworks-reviewed/

Apple’s deep learning frameworks: BNNS vs. Metal CNN

http://machinethink.net/blog/apple-deep-learning-bnns-versus-metal-cnn/

Deep learning Courses

Deep Learning

EECS 598: Unsupervised Feature Learning

NVIDIA’s Deep Learning Courses

https://developer.nvidia.com/deep-learning-courses

ECE 6504 Deep Learning for Perception

University of Oxford: Machine Learning: 2014-2015

University of Birmingham 2014: Introduction to Neural Computation (Level 4/M); Neural Computation (Level 3/H) (by John A. Bullinaria)

http://www.cs.bham.ac.uk/~jxb/inc.html

CMU: Deep Learning

stat212b: Topics Course on Deep Learning for Spring 2016

Good materials on deep learning

http://eclass.cc/courselists/117_deep_learning

Deep Learning: Course by Yann LeCun at Collège de France 2016 (Slides in English)

CSC321 Winter 2015: Introduction to Neural Networks

ELEG 5040: Advanced Topics in Signal Processing (Introduction to Deep Learning)

Self-Study Courses for Deep Learning (NVIDIA Deep Learning Institute)

Introduction to Deep Learning

Deep Learning Courses

Creative Applications of Deep Learning w/ Tensorflow

Deep Learning School: September 24-25, 2016 Stanford, CA

CSC 2541 Fall 2016: Differentiable Inference and Generative Models

CS 294-131: Special Topics in Deep Learning (Fall, 2016)

https://berkeley-deep-learning.github.io/cs294-dl-f16/

Fork of Lempitsky DL for HSE master students.

CS 20SI: Tensorflow for Deep Learning Research

Deep Learning with TensorFlow

https://bigdatauniversity.com/courses/deep-learning-tensorflow/

Deep Learning course

CSE 599G1: Deep Learning System

CSC 321 Winter 2017: Intro to Neural Networks and Machine Learning

http://www.cs.toronto.edu/~rgrosse/courses/csc321_2017/

Theories of Deep Learning (STATS 385)

CS230: Deep Learning Spring 2018

https://web.stanford.edu/class/cs230/

With Video Lectures

Deep Learning: Taking machine learning to the next level (Udacity)

Neural networks class - Université de Sherbrooke

Deep Learning: Theoretical Motivations

University of Waterloo: STAT 946 - Deep Learning

Deep Learning (2016) - BME 595A, Eugenio Culurciello, Purdue University

UVA DEEP LEARNING COURSE

Practical Deep Learning For Coders, Part 1

T81-558: Applications of Deep Neural Networks

CS294-129 Designing, Visualizing and Understanding Deep Neural Networks

MIT 6.S191: Introduction to Deep Learning

edX: Deep Learning Explained

Computer Vision

Stanford CS231n: Convolutional Neural Networks for Visual Recognition (Spring 2017)

Stanford CS231n: Convolutional Neural Networks for Visual Recognition (Winter 2016)

ITP-NYU - Spring 2016

Deep Learning for Computer Vision Barcelona: Summer seminar UPC TelecomBCN (July 4-8, 2016)

DLCV - Deep Learning for Computer Vision

Advanced Computer Vision Cap6412

Natural Language Processing

CS224n: Natural Language Processing with Deep Learning

Course notes for CS224N Winter17

https://github.com/stanfordnlp/cs224n-winter17-notes

Stanford CS224d: Deep Learning for Natural Language Processing

Code for Stanford CS224D: deep learning for natural language understanding

CMU CS 11-747, Fall 2017: Neural Networks for NLP

Deep Learning for NLP - Lecture October 2015

Harvard University: CS287: Natural Language Processing

http://cs287.fas.harvard.edu/

Deep Learning for Natural Language Processing: 2016-2017

GPU Programming

Course on CUDA Programming on NVIDIA GPUs, July 27–31, 2015

An Introduction to GPU Programming using Theano

GPU Programming

Parallel Programming

Intro to Parallel Programming Using CUDA to Harness the Power of GPUs (Udacity)

https://www.udacity.com/course/intro-to-parallel-programming--cs344

Fundamentals of Accelerated Computing with CUDA C/C++

Workshops

Deep Learning: Theory, Algorithms, and Applications

Resources

Open Source Deep Learning Curriculum

http://www.deeplearningweekly.com/pages/open_source_deep_learning_curriculum

Deep Learning Applications

Applications

Acceleration and Model Compression

Papers

Image / Video Captioning

Papers

Im2Text: Describing Images Using 1 Million Captioned Photographs

Long-term Recurrent Convolutional Networks for Visual Recognition and Description

Show and Tell

Show and Tell: A Neural Image Caption Generator

Image caption generation by CNN and LSTM

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Learning a Recurrent Visual Representation for Image Caption Generation

Mind’s Eye: A Recurrent Visual Representation for Image Caption Generation

Deep Visual-Semantic Alignments for Generating Image Descriptions

Deep Captioning with Multimodal Recurrent Neural Networks

Show, Attend and Tell

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (ICML 2015)

Automatically describing historic photographs


Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images

What value do explicit high level concepts have in vision to language problems?

Aligning where to see and what to tell: image caption with region-based attention and scene factorization

Learning FRAME Models Using CNN Filters for Knowledge Visualization (CVPR 2015)

Generating Images from Captions with Attention

Order-Embeddings of Images and Language

DenseCap: Fully Convolutional Localization Networks for Dense Captioning

Expressing an Image Stream with a Sequence of Natural Sentences

Multimodal Pivots for Image Caption Translation

Image Captioning with Deep Bidirectional LSTMs

Encode, Review, and Decode: Reviewer Module for Caption Generation

Review Network for Caption Generation

Attention Correctness in Neural Image Captioning

Image Caption Generation with Text-Conditional Semantic Attention

DeepDiary: Automatic Caption Generation for Lifelogging Image Streams

phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning

Captioning Images with Diverse Objects

Learning to generalize to new compositions in image understanding

Generating captions without looking beyond objects

SPICE: Semantic Propositional Image Caption Evaluation

Boosting Image Captioning with Attributes

Bootstrap, Review, Decode: Using Out-of-Domain Textual Data to Improve Image Captioning

A Hierarchical Approach for Generating Descriptive Image Paragraphs

Dense Captioning with Joint Inference and Visual Context

Optimization of image description metrics using policy gradient methods

Areas of Attention for Image Captioning

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning

Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering

Recurrent Highway Networks with Language CNN for Image Captioning

Top-down Visual Saliency Guided by Captions

MAT: A Multimodal Attentive Translator for Image Captioning

https://arxiv.org/abs/1702.05658

Deep Reinforcement Learning-based Image Captioning with Embedding Reward

Attend to You: Personalized Image Captioning with Context Sequence Memory Networks

Punny Captions: Witty Wordplay in Image Descriptions

https://arxiv.org/abs/1704.08224

Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner

https://arxiv.org/abs/1705.00930

Actor-Critic Sequence Training for Image Captioning

  • intro: Queen Mary University of London & Yang’s Accounting Consultancy Ltd
  • keywords: actor-critic reinforcement learning
  • arxiv: https://arxiv.org/abs/1706.09601

What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

https://arxiv.org/abs/1709.03376

Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning

https://arxiv.org/abs/1709.05038

Contrastive Learning for Image Captioning

Phrase-based Image Captioning with Hierarchical LSTM Model

Convolutional Image Captioning

https://arxiv.org/abs/1711.09151

Show-and-Fool: Crafting Adversarial Examples for Neural Image Captioning

https://arxiv.org/abs/1712.02051

Improved Image Captioning with Adversarial Semantic Alignment

Object Counts! Bringing Explicit Detections Back into Image Captioning

Defoiling Foiled Image Captions

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

Improving Image Captioning with Conditional Generative Adversarial Nets

https://arxiv.org/abs/1805.07112

CNN+CNN: Convolutional Decoders for Image Captioning

https://arxiv.org/abs/1805.09019

Diverse and Controllable Image Captioning with Part-of-Speech Guidance

https://arxiv.org/abs/1805.12589

Learning to Evaluate Image Captioning

Topic-Guided Attention for Image Captioning

Context-Aware Visual Policy Network for Sequence-Level Image Captioning

Exploring Visual Relationship for Image Captioning

Boosted Attention: Leveraging Human Attention for Image Captioning

Image Captioning as Neural Machine Translation Task in SOCKEYE

https://arxiv.org/abs/1810.04101

Unsupervised Image Captioning

https://arxiv.org/abs/1811.10787

Attend More Times for Image Captioning

https://arxiv.org/abs/1812.03283

Object Descriptions

Generation and Comprehension of Unambiguous Object Descriptions

Video Captioning / Description

Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework

Translating Videos to Natural Language Using Deep Recurrent Neural Networks

Describing Videos by Exploiting Temporal Structure

SA-tensorflow: Soft attention mechanism for video caption generation

Sequence to Sequence – Video to Text

Jointly Modeling Embedding and Translation to Bridge Video and Language

Video Description using Bidirectional Recurrent Neural Networks

Bidirectional Long-Short Term Memory for Video Description

3 Ways to Subtitle and Caption Your Videos Automatically Using Artificial Intelligence

Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

Grounding and Generation of Natural Language Descriptions for Images and Videos

Video Captioning and Retrieval Models with Semantic Attention

  • intro: Winner of three (fill-in-the-blank, multiple-choice test, and movie retrieval) out of four tasks of the LSMDC 2016 Challenge (Workshop in ECCV 2016)
  • arxiv: https://arxiv.org/abs/1610.02947

Spatio-Temporal Attention Models for Grounded Video Captioning

Video and Language: Bridging Video and Language with Deep Learning

Recurrent Memory Addressing for describing videos

Video Captioning with Transferred Semantic Attributes

Adaptive Feature Abstraction for Translating Video to Language

Semantic Compositional Networks for Visual Captioning

Hierarchical Boundary-Aware Neural Encoder for Video Captioning

Attention-Based Multimodal Fusion for Video Description

Weakly Supervised Dense Video Captioning

Generating Descriptions with Grounded and Co-Referenced People

Multi-Task Video Captioning with Video and Entailment Generation

Dense-Captioning Events in Videos

Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning

https://arxiv.org/abs/1706.01231

Reinforced Video Captioning with Entailment Rewards

End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering

From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning

https://arxiv.org/abs/1708.02478

Grounded Objects and Interactions for Video Captioning

https://arxiv.org/abs/1711.06354

Integrating both Visual and Audio Cues for Enhanced Video Caption

https://arxiv.org/abs/1711.08097

Video Captioning via Hierarchical Reinforcement Learning

https://arxiv.org/abs/1711.11135

Consensus-based Sequence Training for Video Captioning

https://arxiv.org/abs/1712.09532

Less Is More: Picking Informative Frames for Video Captioning

https://arxiv.org/abs/1803.01457

End-to-End Video Captioning with Multitask Reinforcement Learning

https://arxiv.org/abs/1803.07950

End-to-End Dense Video Captioning with Masked Transformer

Reconstruction Network for Video Captioning

Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning

Jointly Localizing and Describing Events for Dense Video Captioning

Contextualize, Show and Tell: A Neural Visual Storyteller

https://arxiv.org/abs/1806.00738

RUC+CMU: System Report for Dense Captioning Events in Videos

Streamlined Dense Video Captioning

Projects

Learning CNN-LSTM Architectures for Image Caption Generation: An implementation of CNN-LSTM image caption generator architecture that achieves close to state-of-the-art results on the MSCOCO dataset.

screengrab-caption: an openframeworks app that live-captions your desktop screen with a neural net

Tools

CaptionBot (Microsoft)

Blogs

Captioning Novel Objects in Images

http://bair.berkeley.edu/jacky/2017/08/08/novel-object-captioning/

Deep Learning and Autonomous Driving

Courses

(Toronto) CSC2541: Visual Perception for Autonomous Driving, Winter 2016

(MIT) 6.S094: Deep Learning for Self-Driving Cars

How to Land An Autonomous Vehicle Job: Coursework

Papers

An Empirical Evaluation of Deep Learning on Highway Driving

Real-time Joint Object Detection and Semantic Segmentation Network for Automated Driving

Optical Flow augmented Semantic Segmentation networks for Automated Driving

AuxNet: Auxiliary tasks enhanced Semantic Segmentation for Automated Driving

Design of Real-time Semantic Segmentation Decoder for Automated Driving

Hierarchical Multi-task Deep Neural Network Architecture for End-to-End Driving

https://arxiv.org/abs/1902.03466

DeepDriving

DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving

End to End Learning for Self-Driving Cars

End-to-End Deep Learning for Self-Driving Cars

Can we unify monocular detectors for autonomous driving by using the pixel-wise semantic segmentation of CNNs?

BRAIN4CARS: Cabin Sensing for Safe and Personalized Driving

Brain4Cars: Sensory-Fusion Recurrent Neural Models for Driver Activity Anticipation

Brain4Cars: Car That Knows Before You Do via Sensory-Fusion Deep Learning Architecture

Car that Knows Before You Do: Anticipating Maneuvers via Learning Temporal Driving Models

Recurrent Neural Networks for Driver Activity Anticipation via Sensory-Fusion Architecture

Long-term Planning by Short-term Prediction

Learning a Driving Simulator

Comma.ai open-sources the data it used for its first successful driverless trips

Autonomous driving challenge: To Infer the property of a dynamic object based on its motion pattern using recurrent neural network

Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving

Learning from Maps: Visual Common Sense for Autonomous Driving

SAD-GAN: Synthetic Autonomous Driving using Generative Adversarial Networks

  • intro: Accepted at the Deep Learning for Action and Interaction Workshop, 30th Conference on Neural Information Processing Systems (NIPS 2016)
  • arxiv: https://arxiv.org/abs/1611.08788

MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving

Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention

Virtual to Real Reinforcement Learning for Autonomous Driving

Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art

Deep Reinforcement Learning framework for Autonomous Driving

https://arxiv.org/abs/1704.02532

Systematic Testing of Convolutional Neural Networks for Autonomous Driving

https://arxiv.org/abs/1708.03309

MODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving

https://arxiv.org/abs/1709.04821

CFENet: An Accurate and Efficient Single-Shot Object Detector for Autonomous Driving

LaneNet: Real-Time Lane Detection Networks for Autonomous Driving

Learning End-to-end Autonomous Driving using Guided Auxiliary Supervision

https://arxiv.org/abs/1808.10393

Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability

Pixel and Feature Level Based Domain Adaption for Object Detection in Autonomous Driving

https://arxiv.org/abs/1810.00345

Multi-task Learning with Attention for End-to-end Autonomous Driving

MP3: A Unified Model to Map, Perceive, Predict and Plan

Level 2 Autonomous Driving on a Single Device: Diving into the Devils of Openpilot

Real-time Full-stack Traffic Scene Perception for Autonomous Driving with Roadside Cameras

ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving

Planning-oriented Autonomous Driving

Projects

Caffe-Autopilot: Car autopilot software that uses C++, BVLC Caffe, OpenCV, and SFML

Self Driving Car Demo

Autoware: Open-source software for urban autonomous driving

Open Sourcing 223GB of Driving Data

Machine Learning for RC Cars

Self Driving (Toy) Ferrari

Lane Finding Project for Self-Driving Car ND

Instructions on how to get your development environment ready for Udacity Self Driving Car (SDC) Challenges

DeepDrive: self-driving car AI

DeepDrive setup: Run a self-driving car simulator from the comfort of your own PC

DeepTesla: End-to-End Learning from Human and Autopilot Driving

http://selfdrivingcars.mit.edu/deeptesla/

DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car

Autonomous Driving in Reality with Reinforcement Learning and Image Translation

End-to-end Multi-Modal Multi-Task Vehicle Control for Self-Driving Cars with Visual Perception

https://arxiv.org/abs/1801.06734

Blogs

Self-driving cars: How far away are we REALLY from autonomous cars? (7 Aug 2015)

http://www.alphr.com/cars/1001329/self-driving-cars-how-far-away-are-we-really-from-autonomous-cars

Practice makes perfect: Driverless cars will learn from their mistakes (9 Oct 2015)

http://www.alphr.com/cars/1001713/practice-makes-perfect-driverless-cars-will-learn-from-their-mistakes

Eyes on the Road: How Autonomous Cars Understand What They’re Seeing

Human-in-the-loop deep learning will help drive autonomous cars

http://venturebeat.com/2016/06/25/human-in-the-loop-deep-learning-will-help-drive-autonomous-cars/

Using reinforcement learning in Python to teach a virtual car to avoid obstacles

Autonomous RC car using Raspberry Pi and Neural Networks

The Road Ahead: Autonomous Vehicles Startup Ecosystem

https://medium.com/the-mission/the-road-ahead-autonomous-vehicles-startup-ecosystem-3c91d546673d#.gft1xyh9l

Deep Driving - A revolutionary AI technique is about to transform the self-driving car

https://www.technologyreview.com/s/602600/deep-driving/

Visualizations for regressing wheel steering angles in self driving cars with Keras

Published: 09 Oct 2015