Accelerating Deep Convolutional Neural Networks Using Specialized Hardware

Installation / Deploying

Setting up a Deep Learning Machine from Scratch (Software): Instructions for setting up the software on your deep learning machine

  • intro: A detailed guide to setting up your machine for deep learning research. Includes instructions for installing drivers, tools, and various deep learning frameworks. Tested on a 64-bit machine with an Nvidia Titan X, running Ubuntu 14.04.

How to install CUDA Toolkit and cuDNN for deep learning

Deploying Deep Learning: Guide to deploying deep-learning inference networks and realtime object detection with TensorRT and Jetson TX1.

Install Log

Lessons Learned from Deploying Deep Learning at Scale


All-in-one Docker image for Deep Learning

NVIDIA Docker: GPU Server Application Deployment Made Easy

Deep learning base image for Docker (Tensorflow, Caffe, MXNet, Torch, Openface, etc.)

Deepo: a Docker image with a full reproducible deep learning research environment


SuperVessel Cloud for POWER/OpenPOWER

Building Deep Neural Networks in the Cloud with Azure GPU VMs, MXNet and Microsoft R Server

Microsoft open sources its next-gen cloud hardware design

Google Taps AMD For Accelerating Machine Learning In The Cloud

Amazon EC2

Deep Learning AMI on AWS Marketplace

We Have To Go Deeper: AWS p2.xlarge GPU optimized deep learning cluster-grenade

A GPU enabled AMI for Deep Learning

Keras with GPU on Amazon EC2 – a step-by-step instruction

Microsoft R Server

Training Deep Neural Networks on ImageNet Using Microsoft R Server and Azure GPU VMs

Hardware System

I: Building a Deep Learning (Dream) Machine

II: Running a Deep Learning (Dream) Machine

A Full Hardware Guide to Deep Learning

Build your own Deep Learning Box

32-TFLOP Deep Learning GPU Box: A super-fast linux-based machine with multiple GPUs for training deep neural nets

Hands-on with the NVIDIA DIGITS DevBox for Deep Learning

Considerations when setting up deep learning hardware

Building a Workstation for Deep Learning

Deep Learning Machine: First build experience

Building a machine learning/deep learning workstation for under $5000

Hardware Guide: Neural Networks on GPUs (Updated 2016-1-30)

Building Your Own Deep Learning Box

Setting up a Deep learning machine in a lazy yet quick way

Deep Confusion: Misadventures In Building A Deep Learning Machine



Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning


GPU Tinkering Notes, 2015 (by 李沐 / Mu Li)

HPC, Deep Learning and GPUs (2016 Stanford HPC Conference)

Modern GPU 2.0: Design patterns for GPU computing

CuMF: CUDA-Accelerated ALS on multiple GPUs.

Basic Performance Analysis of NVIDIA GPU Accelerator Cards for Deep Learning Applications

CuPy: NumPy-like API accelerated with CUDA

NumPy GPU acceleration

Efficient Convolutional Neural Network Inference on Mobile GPUs (Embedded Vision Summit)

Deep Learning with Multiple GPUs on Rescale: Torch

GPU-accelerated Theano & Keras on Windows 10 native

NVIDIA Announces Quadro GP100 - Big Pascal Comes to Workstations


Recurrent Neural Networks Hardware Implementation on FPGA

Is implementing deep learning on FPGAs a natural next step after the success with GPUs?

Efficient Implementation of Neural Network Systems Built on FPGAs, Programmed with OpenCL

Deep Learning on FPGAs: Past, Present, and Future

FPGAs Challenge GPUs as a Platform for Deep Learning

Convolution Neural Network CNN Implementation on Altera FPGA using OpenCL

Accelerating Deep Learning Using Altera FPGAs (Embedded Vision Summit)

Machine Learning on FPGAs: Neural Networks

Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs

Microsoft Goes All in for FPGAs to Build Out AI Cloud

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Intel Unveils FPGA to Accelerate Neural Networks

Deep Learning with FPGA

A General Neural Network Hardware Architecture on FPGA

Approximate FPGA-based LSTMs under Computation Time Constraints

ARM / Processor

‘Neural network’ spotted deep inside Samsung’s Galaxy S7 silicon brain: Secrets of Exynos M1 cores spilled

Intel will add deep-learning instructions to its processors


ShiDianNao: Shifting Vision Processing Closer to the Sensor


Emerging “Universal” FPGA, GPU Platform for Deep Learning

An Early Look at Startup Graphcore’s Deep Learning Chip

Hardware for Deep Learning


Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural Networks