Deep Learning Tricks


Practical recommendations for gradient-based training of deep architectures


Efficient BackProp (Neural Networks: Tricks of the Trade, 2nd ed.)

Deep Learning for Vision: Tricks of the Trade

Optimizing RNN performance (Silicon Valley AI Lab)

Must Know Tips/Tricks in Deep Neural Networks

Training Tricks from Deeplearning4j

Suggestions for DL from Ilya Sutskever

Efficient Training Strategies for Deep Neural Network Language Models

Neural Networks Best Practice

Dark Knowledge from Hinton

Stochastic Gradient Descent Tricks (Leon Bottou)

Advice for applying Machine Learning

How to Debug Learning Algorithm for Regression Model

Large-scale L-BFGS using MapReduce

Selecting good features

  • Part I: univariate selection
  • Part II: linear models and regularization
  • Part III: random forests
  • Part IV: stability selection, RFE and everything side by side


Stochastic Gradient Boosting: Choosing the Best Number of Iterations

Large-Scale High-Precision Topic Modeling on Twitter

H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel

How To Improve Deep Learning Performance: 20 Tips, Tricks and Techniques That You Can Use To Fight Overfitting and Get Better Generalization

Neural Network Training Speed Trick

The Black Magic of Deep Learning - Tips and Tricks for the practitioner

Published: 09 Oct 2015

Deep Learning Software and Hardware


Accelerating Deep Convolutional Neural Networks Using Specialized Hardware

Installation / Deploying

Setting up a Deep Learning Machine from Scratch (Software): Instructions for setting up the software on your deep learning machine

  • intro: A detailed guide to setting up your machine for deep learning research. Includes instructions to install drivers, tools and various deep learning frameworks. This was tested on a 64-bit machine with an Nvidia Titan X, running Ubuntu 14.04.

How to install CUDA Toolkit and cuDNN for deep learning

Deploying Deep Learning: Guide to deploying deep-learning inference networks and realtime object detection with TensorRT and Jetson TX1.

Install Log

Lessons Learned from Deploying Deep Learning at Scale


All-in-one Docker image for Deep Learning

NVIDIA Docker: GPU Server Application Deployment Made Easy

Deep learning base image for Docker (Tensorflow, Caffe, MXNet, Torch, Openface, etc.)


SuperVessel Cloud for POWER/OpenPOWER

Building Deep Neural Networks in the Cloud with Azure GPU VMs, MXNet and Microsoft R Server

Microsoft open sources its next-gen cloud hardware design

Google Taps AMD For Accelerating Machine Learning In The Cloud

Amazon EC2

Deep Learning AMI on AWS Marketplace

We Have To Go Deeper: AWS p2.xlarge GPU optimized deep learning cluster-grenade

A GPU enabled AMI for Deep Learning

Keras with GPU on Amazon EC2 – a step-by-step instruction

Microsoft R Server

Training Deep Neural Networks on ImageNet Using Microsoft R Server and Azure GPU VMs

Hardware System

I: Building a Deep Learning (Dream) Machine

II: Running a Deep Learning (Dream) Machine

A Full Hardware Guide to Deep Learning

Build your own Deep Learning Box

32-TFLOP Deep Learning GPU Box: A super-fast linux-based machine with multiple GPUs for training deep neural nets

Hands-on with the NVIDIA DIGITS DevBox for Deep Learning

Considerations when setting up deep learning hardware

Building a Workstation for Deep Learning

Deep Learning Machine: First build experience

Building a machine learning/deep learning workstation for under $5000

Hardware Guide: Neural Networks on GPUs (Updated 2016-1-30)

Building Your Own Deep Learning Box

Setting up a Deep learning machine in a lazy yet quick way

Deep Confusion: Misadventures In Building A Deep Learning Machine



Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning


GPU Tinkering Notes, 2015 (by Mu Li, 李沐)

HPC, Deep Learning and GPUs (2016 Stanford HPC Conference)

Modern GPU 2.0: Design patterns for GPU computing

CuMF: CUDA-Accelerated ALS on multiple GPUs.

Basic Performance Analysis of NVIDIA GPU Accelerator Cards for Deep Learning Applications

CuPy: NumPy-like API accelerated with CUDA

NumPy GPU acceleration

Efficient Convolutional Neural Network Inference on Mobile GPUs (Embedded Vision Summit)

Deep Learning with Multiple GPUs on Rescale: Torch

GPU-accelerated Theano & Keras on Windows 10 native

NVIDIA Announces Quadro GP100 - Big Pascal Comes to Workstations


Recurrent Neural Networks Hardware Implementation on FPGA

Is implementing deep learning on FPGAs a natural next step after the success with GPUs?

Efficient Implementation of Neural Network Systems Built on FPGAs, Programmed with OpenCL

Deep Learning on FPGAs: Past, Present, and Future

FPGAs Challenge GPUs as a Platform for Deep Learning

Convolution Neural Network CNN Implementation on Altera FPGA using OpenCL

Accelerating Deep Learning Using Altera FPGAs (Embedded Vision Summit)

Machine Learning on FPGAs: Neural Networks

Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs

Microsoft Goes All in for FPGAs to Build Out AI Cloud

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Intel Unveils FPGA to Accelerate Neural Networks

Deep Learning with FPGA

ARM / Processor

‘Neural network’ spotted deep inside Samsung’s Galaxy S7 silicon brain: Secrets of Exynos M1 cores spilled

Intel will add deep-learning instructions to its processors


ShiDianNao: Shifting Vision Processing Closer to the Sensor


Emerging “Universal” FPGA, GPU Platform for Deep Learning

An Early Look at Startup Graphcore’s Deep Learning Chip

Hardware for Deep Learning


Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural Networks

Published: 09 Oct 2015

Deep Learning Resources


Published: 09 Oct 2015

Deep Learning On Distributed Systems


Large Scale Distributed Systems for Training Neural Networks


Large Scale Distributed Deep Networks

Implementation of a Practical Distributed Calculation System with Browsers and JavaScript, and Application to Distributed Deep Learning


SparkNet: Training Deep Networks in Spark

A Scalable Implementation of Deep Learning on Spark

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Distributed Supervised Learning using Neural Networks

Distributed Training of Deep Neuronal Networks: Theoretical and Practical Limits of Parallel Scalability

How to scale distributed deep learning?


Theano-MPI: a Theano-based Distributed Training Framework

CaffeOnSpark: Open Sourced for Distributed Deep Learning on Big Data Clusters

Tunnel: Data Driven Framework for Distributed Computing in Torch 7

Distributed deep learning with Keras and Apache Spark

BigDL: Distributed Deep learning Library for Apache Spark



Distributed TensorFlow on Spark: Scaling Google’s Deep Learning Library (Spark Summit)

Deep Recurrent Neural Networks for Sequence Learning in Spark (Spark Summit)

Distributed deep learning on Spark


Hadoop, Spark, Deep Learning Mesh on Single GPU Cluster

The Unreasonable Effectiveness of Deep Learning on Spark

Distributed Deep Learning with Caffe Using a MapR Cluster

Deep Learning with Apache Spark and TensorFlow

Deeplearning4j on Spark

Distributed Deep Learning, Part 1: An Introduction to Distributed Training of Neural Networks

GPU Acceleration in Databricks: Speeding Up Deep Learning on Apache Spark

Distributed Deep Learning with Apache Spark and Keras

Published: 09 Oct 2015

Deep Learning Frameworks


Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine

Apache SINGA


Blocks: A Theano framework for building and training neural networks

Blocks and Fuel: Frameworks for deep learning


BrainCore: The iOS and OS X neural network framework


Brainstorm: Fast, flexible and fun neural networks


Caffe: Convolutional Architecture for Fast Feature Embedding

OpenCL Caffe

Caffe on both Linux and Windows

ApolloCaffe: a fork of Caffe that supports dynamic networks

fb-caffe-exts: Some handy utility libraries and tools for the Caffe deep learning framework

Caffe-Android-Lib: Porting caffe to android platform

caffe-android-demo: An android caffe demo app exploiting caffe pre-trained ImageNet model for image classification

Caffe.js: Run Caffe models in the browser using ConvNetJS

Intel Caffe

  • intro: This fork of BVLC/Caffe is dedicated to improving performance of this deep learning framework when running on CPU, in particular Intel® Xeon processors (HSW+) and Intel® Xeon Phi processors



Caffe on Mobile Devices


  • intro: Using ARM Compute Library (NEON+GPU) to speed up caffe; Providing utilities to debug, profile and tune application performance

Multi-GPU / MPI Caffe

Caffe with OpenMPI-based Multi-GPU support

mpi-caffe: Model-distributed Deep Learning with Caffe and MPI

Caffe-MPI for Deep Learning

Caffe Utils



Caffe2: A New Lightweight, Modular, and Scalable Deep Learning Framework


CDNN2 - CEVA Deep Neural Network Software Framework


Chainer: a neural network framework

Introduction to Chainer: Neural Networks in Python


CNTK: Computational Network Toolkit

An Introduction to Computational Networks and the Computational Network Toolkit


ConvNetJS: Deep Learning in Javascript. Train Convolutional Neural Networks (or ordinary ones) in your browser


DeepBeliefSDK: The SDK for Jetpac’s iOS, Android, Linux, and OS X Deep Belief image recognition framework


DeepDetect: Open Source API & Deep Learning Server

Deeplearning4j (DL4J)

Deeplearning4j: Deep Learning for Java

Deeplearning4j images for CUDA and Hadoop.

Deeplearning4J Examples


DeepLearningKit: Open Source Deep Learning Framework for Apple’s tvOS, iOS and OS X

Tutorial — Using DeepLearningKit with iOS for iPhone and iPad


DeepSpark: Deeplearning framework running on Spark


DIGITS: the Deep Learning GPU Training System


dp: A deep learning library for streamlining research and development using the Torch7 distribution


Dragon: A Computation Graph Virtual Machine Based Deep Learning Framework


DyNet: The Dynamic Neural Network Toolkit

DyNet Benchmarks


IDLF: The Intel® Deep Learning Framework


Keras: Deep Learning library for Theano and TensorFlow

MarcBS/keras fork

Hera: Train/evaluate a Keras model, get metrics streamed to a dashboard in your browser.

Installing Keras for deep learning

Keras Applications - deep learning models that are made available alongside pre-trained weights

Keras resources: Directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library

Keras.js: Run trained Keras models in the browser, with GPU support


keras-cn: Chinese keras documents with more examples, explanations and tips.

Kerasify: Small library for running Keras models from a C++ application


Knet: Koç University deep learning framework


Lasagne: Lightweight library to build and train neural networks in Theano


Leaf: The Hacker’s Machine Learning Engine


LightNet: A Versatile, Standalone and Matlab-based Environment for Deep Learning


MatConvNet: CNNs for MATLAB


Marvin: A minimalist GPU-only N-dimensional ConvNet framework



Mocha.jl: Deep Learning for Julia



MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

MXNet Model Gallery: Pre-trained Models of DMLC Project

A short introduction to MXNet design and implementation (Chinese)

Deep learning for hackers with MXNet (1): GPU installation and MNIST

MXNet: Efficient, Flexible Deep Learning Framework

Use Caffe operator in MXNet

Deep Learning in a Single File for Smart Devices

MXNet Pascal Titan X benchmark


Hands-on deep learning with MXNet (2): Neural art

Programming Models and Systems Design for Deep Learning

Awesome MXNet

Getting Started with MXNet

gtc_tutorial: MXNet Tutorial for NVIDIA GTC 2016

MXNet Dependency Engine


WhatsThis-iOS: MXNet WhatThis Example for iOS



Run trained deep neural networks in the browser or node.js


Neon: Nervana’s Python-based deep learning library

Tools to convert Caffe models to neon’s serialization format

Nervana’s Deep Learning Course


NNabla - Neural Network Libraries by Sony

  • intro: NNabla is a deep learning framework intended for research, development and production. It aims to run everywhere: desktop PCs, HPC clusters, embedded devices and production servers.


OpenDeep: a fully modular & extensible deep learning framework in Python


OpenNN - Open Neural Networks Library


PaddlePaddle: PArallel Distributed Deep LEarning



Petuum: a distributed machine learning framework


Platoon: Multi-GPU mini-framework for Theano


Poseidon: Distributed Deep Learning Framework on Petuum


Purine: A bi-graph based deep learning framework



Datasets, Transforms and Models specific to Computer Vision

Convert torch to pytorch




TensorDebugger (TDB)

TensorDebugger (TDB): Interactive, node-by-node debugging and visualization for TensorFlow

ofxMSATensorFlow: OpenFrameworks addon for Google’s data-flow graph based numerical computation / machine intelligence library TensorFlow.

TFLearn: Deep learning library featuring a higher-level API for TensorFlow

TensorFlow on Spark


TensorFlow.jl: A Julia wrapper for the TensorFlow Python library

TensorLayer: Deep learning and Reinforcement learning library for TensorFlow

OpenCL support for TensorFlow

Pretty Tensor: Fluent Networks in TensorFlow

Rust language bindings for TensorFlow

TensorFlow Ecosystem: Integration of TensorFlow with other open-source frameworks

Caffe to TensorFlow

TensorFlow Mobile


TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

TensorFlow: A system for large-scale machine learning


TensorFlow official documentation (Chinese edition)



Theano-Tutorials: Bare bones introduction to machine learning from linear regression to convolutional neural networks using Theano

Theano: A Python framework for fast computation of mathematical expressions

Configuring Theano For High Performance Deep Learning

Theano: a short practical guide

Ian Goodfellow’s Tutorials on Theano

Plato: A library built on top of Theano

Theano Windows Install Guide

Theano-MPI: a Theano-based Distributed Training Framework

tiny-dnn (tiny-cnn)

tiny-dnn: A header only, dependency-free deep learning framework in C++11

Deep learning with C++ - an introduction to tiny-dnn



loadcaffe: Load Caffe networks in Torch7

Applied Deep Learning for Computer Vision with Torch

pytorch: Python wrappers for torch and lua

Torch Toolbox: A collection of snippets and libraries for Torch

cltorch: a Hardware-Agnostic Backend for the Torch Deep Neural Network Library, Based on OpenCL

Torchnet: An Open-Source Platform for (Deep) Learning Research

THFFmpeg: Torch bindings for FFmpeg (reading videos only)

caffegraph: Load Caffe networks in Torch7 using nngraph

Optimized-Torch: Intel Torch is dedicated to improving Torch performance when running on CPU

Torch Video Tutorials

Torch in Action


VELES: Distributed platform for rapid Deep learning application development


WebDNN: Fastest DNN Execution Framework on Web Browser


Yann: Yet Another Neural Network Toolbox



Deep Learning Implementations and Frameworks (DLIF)


Comparative Study of Deep Learning Software Frameworks

Benchmarking State-of-the-Art Deep Learning Software Tools


TensorFuse: Common interface for Theano, CGT, and TensorFlow

DeepRosetta: A universal deep learning model converter


Frameworks and Libraries for Deep Learning

TensorFlow vs. Theano vs. Torch

Evaluation of Deep Learning Toolkits

Deep Machine Learning libraries and frameworks

Torch vs Theano

Deep Learning Software: NVIDIA Deep Learning SDK

A comparison of deep learning frameworks

TensorFlow Meets Microsoft’s CNTK

Is there a case for still using Torch, Theano, Brainstorm, MXNET and not switching to TensorFlow?


DL4J vs. Torch vs. Theano vs. Caffe vs. TensorFlow

Popular Deep Learning Libraries

The simple example of Theano and Lasagne super power

Comparison of deep learning software

A Look at Popular Machine Learning Frameworks

5 Deep Learning Projects You Can No Longer Overlook

Comparison of Deep Learning Libraries After Years of Use

Deep Learning Part 1: Comparison of Symbolic Deep Learning Frameworks

Deep Learning Frameworks Compared


Deep Learning frameworks: a review before finishing 2016

The Anatomy of Deep Learning Frameworks

Python Deep Learning Frameworks Reviewed

Apple’s deep learning frameworks: BNNS vs. Metal CNN

Published: 09 Oct 2015

Deep Learning Courses

Deep Learning

EECS 598: Unsupervised Feature Learning

NVIDIA’s Deep Learning Courses

ECE 6504 Deep Learning for Perception

University of Oxford: Machine Learning: 2014-2015

University of Birmingham 2014: Introduction to Neural Computation (Level 4/M); Neural Computation (Level 3/H) (by John A. Bullinaria)

CMU: Deep Learning

stat212b: Topics Course on Deep Learning for Spring 2016

Good materials on deep learning

Deep Learning: Course by Yann LeCun at Collège de France 2016 (Slides in English)

CSC321 Winter 2015: Introduction to Neural Networks

ELEG 5040: Advanced Topics in Signal Processing (Introduction to Deep Learning)

Self-Study Courses for Deep Learning (NVIDIA Deep Learning Institute)

Introduction to Deep Learning

Deep Learning Courses

Creative Applications of Deep Learning w/ Tensorflow

Deep Learning School: September 24-25, 2016 Stanford, CA

CSC 2541 Fall 2016: Differentiable Inference and Generative Models

CS 294-131: Special Topics in Deep Learning (Fall, 2016)

Fork of Lempitsky DL for HSE master students.


CS 20SI: Tensorflow for Deep Learning Research

Deep Learning with TensorFlow

Deep Learning course

CSE 599G1: Deep Learning System

CSC 321 Winter 2017: Intro to Neural Networks and Machine Learning

With Video Lectures

Deep Learning: Taking machine learning to the next level (Udacity)

Neural networks class - Université de Sherbrooke

Deep Learning: Theoretical Motivations

University of Waterloo: STAT 946 - Deep Learning

Deep Learning (2016) - BME 595A, Eugenio Culurciello, Purdue University


Practical Deep Learning For Coders, Part 1

T81-558: Applications of Deep Neural Networks

CS294-129 Designing, Visualizing and Understanding Deep Neural Networks

MIT 6.S191: Introduction to Deep Learning

edX: Deep Learning Explained

Computer Vision

Stanford CS231n: Convolutional Neural Networks for Visual Recognition (Spring 2017)

Stanford CS231n: Convolutional Neural Networks for Visual Recognition (Winter 2016)

ITP-NYU - Spring 2016

Deep Learning for Computer Vision Barcelona: Summer seminar UPC TelecomBCN (July 4-8, 2016)

DLCV - Deep Learning for Computer Vision

Natural Language Processing

CS224n: Natural Language Processing with Deep Learning

Course notes for CS224N Winter17

Stanford CS224d: Deep Learning for Natural Language Processing

Code for Stanford CS224D: deep learning for natural language understanding

Deep Learning for NLP - Lecture October 2015

Harvard University: CS287: Natural Language Processing

Deep Learning for Natural Language Processing: 2016-2017

GPU Programming

Course on CUDA Programming on NVIDIA GPUs, July 27–31, 2015

An Introduction to GPU Programming using Theano

GPU Programming

Parallel Programming

Intro to Parallel Programming: Using CUDA to Harness the Power of GPUs (Udacity, CS344)


Open Source Deep Learning Curriculum

Published: 09 Oct 2015

Deep Learning Applications


Published: 09 Oct 2015

Deep Learning And 3D


Learning Spatiotemporal Features with 3D Convolutional Networks (C3D: Generic Features for Video Analysis)

C3D Model for Keras trained over Sports 1M

Sports 1M C3D Network to Keras

Deep End2End Voxel2Voxel Prediction

Aligning 3D Models to RGB-D Images of Cluttered Scenes

Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images

Multi-view 3D Models from Single Images with a Convolutional Network

Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video

RotationNet: Learning Object Classification Using Unsupervised Viewpoint Estimation


DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding

Volumetric and Multi-View CNNs for Object Classification on 3D Data


Deep3D: Automatic 2D-to-3D Video Conversion with CNNs

Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks


3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction


StereoConvNet: Stereo convolutional neural network for depth map prediction from stereo images

Published: 09 Oct 2015