Papers
Accelerating Deep Convolutional Neural Networks Using Specialized Hardware
Installation / Deploying
Setting up a Deep Learning Machine from Scratch (Software): Instructions for setting up the software on your deep learning machine
- intro: A detailed guide to setting up your machine for deep learning research. Includes instructions to install drivers, tools, and various deep learning frameworks. Tested on a 64-bit machine with an Nvidia Titan X, running Ubuntu 14.04
- github: https://github.com/saiprashanths/dl-setup
How to install CUDA Toolkit and cuDNN for deep learning
- blog: http://www.pyimagesearch.com/2016/07/04/how-to-install-cuda-toolkit-and-cudnn-for-deep-learning/
Deploying Deep Learning: Guide to deploying deep-learning inference networks and realtime object detection with TensorRT and Jetson TX1.
Install Log
- intro: Setting up Caffe on a cluster running Red Hat 6.3 (Santiago) without root access
- github: https://github.com/yosinski/caffe/blob/jason_public/doc/linux-no-root-install-log.md
Lessons Learned from Deploying Deep Learning at Scale
Docker
All-in-one Docker image for Deep Learning
- intro: An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.)
- github: https://github.com/saiprashanths/dl-docker
NVIDIA Docker: GPU Server Application Deployment Made Easy
- blog: https://devblogs.nvidia.com/parallelforall/nvidia-docker-gpu-server-application-deployment-made-easy/
- github: https://github.com/NVIDIA/nvidia-docker
Deep learning base image for Docker (Tensorflow, Caffe, MXNet, Torch, Openface, etc.)
- github: https://github.com/dominiek/deep-base
Deepo: a Docker image with a full reproducible deep learning research environment
- intro: A Docker image containing almost all popular deep learning frameworks: theano, tensorflow, sonnet, pytorch, keras, lasagne, mxnet, cntk, chainer, caffe, torch.
- project page: https://hub.docker.com/r/ufoym/deepo/
- github: https://github.com/ufoym/deepo
Cloud
SuperVessel Cloud for POWER/OpenPOWER
Building Deep Neural Networks in the Cloud with Azure GPU VMs, MXNet and Microsoft R Server
Microsoft open sources its next-gen cloud hardware design
Google Taps AMD For Accelerating Machine Learning In The Cloud
Amazon EC2
Deep Learning AMI on AWS Marketplace
https://aws.amazon.com/marketplace/pp/B01M0AXXQB
We Have To Go Deeper: AWS p2.xlarge GPU optimized deep learning cluster-grenade
- github: https://github.com/Miej/GoDeeper
A GPU enabled AMI for Deep Learning
Keras with GPU on Amazon EC2 – a step-by-step instruction
Microsoft R Server
Training Deep Neural Networks on ImageNet Using Microsoft R Server and Azure GPU VMs
Hardware System
I: Building a Deep Learning (Dream) Machine
- blog: http://graphific.github.io/posts/building-a-deep-learning-dream-machine/
- slides: http://www.slideshare.net/roelofp/building-a-deep-learning-dream-machine
II: Running a Deep Learning (Dream) Machine
A Full Hardware Guide to Deep Learning
Build your own Deep Learning Box
32-TFLOP Deep Learning GPU Box: A super-fast linux-based machine with multiple GPUs for training deep neural nets
https://hackaday.io/project/12070-32-tflop-deep-learning-gpu-box
Hands-on with the NVIDIA DIGITS DevBox for Deep Learning
- blog: http://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/
Considerations when setting up deep learning hardware
- blog: http://www.pyimagesearch.com/2016/06/13/considerations-when-setting-up-deep-learning-hardware/
Building a Workstation for Deep Learning
Deep Learning Machine: First build experience
- blog: https://medium.com/@vivek.yadav/deep-learning-machine-first-build-experience-d04abf198831#.1d6q5mw9m
Building a machine learning/deep learning workstation for under $5000
Hardware Guide: Neural Networks on GPUs (Updated 2016-1-30)
- intro: by Joseph Redmon
- blog: http://pjreddie.com/darknet/hardware-guide/
Building Your Own Deep Learning Box
- blog: https://medium.com/@bfortuner/building-your-own-deep-learning-box-47b918aea1eb#.4r5zchk4f
Setting up a Deep learning machine in a lazy yet quick way
- blog: https://medium.com/@sravsatuluri/setting-up-a-deep-learning-machine-in-a-lazy-yet-quick-way-be2642318850#.jrxrkfxa2
Deep Confusion: Misadventures In Building A Deep Learning Machine
http://www.topbots.com/deep-confusion-misadventures-in-building-a-machine-learning-server/
DIY-Deep-Learning-Workstation
- intro: Build a deep learning workstation from scratch (HW & SW).
- github: https://github.com/charlesq34/DIY-Deep-Learning-Workstation
GPU
Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning
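GPU Hardware Architecture, Seen Through the Question of Which GPU to Choose for Deep Learning (original title: 从深度学习选择什么样的gpu来谈谈gpu的硬件架构)
GPU Tinkering Notes, 2015, by Mu Li (original title: GPU折腾手记——2015, by 李沐)
HPC, Deep Learning and GPUs (2016 Stanford HPC Conference)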
Modern GPU 2.0: Design patterns for GPU computing
- intro: Modern GPU is code and commentary intended to promote new and productive ways of thinking about GPU computing.
- homepage: http://nvlabs.github.io/moderngpu/
- github: https://github.com/nvlabs/moderngpu
CuMF: CUDA-Accelerated ALS on multiple GPUs.
- github: https://github.com/wei-tan/CuMF
Basic Performance Analysis of NVIDIA GPU Accelerator Cards for Deep Learning Applications
CuPy: NumPy-like API accelerated with CUDA
- github: https://github.com/pfnet/cupy
NumPy GPU acceleration
Efficient Convolutional Neural Network Inference on Mobile GPUs (Embedded Vision Summit)
Deep Learning with Multiple GPUs on Rescale: Torch
GPU-accelerated Theano & Keras on Windows 10 native
NVIDIA Announces Quadro GP100 - Big Pascal Comes to Workstations
http://www.anandtech.com/show/11102/nvidia-announces-quadro-gp100
FPGA
Recurrent Neural Networks Hardware Implementation on FPGA
Is implementing deep learning on FPGAs a natural next step after the success with GPUs?
Efficient Implementation of Neural Network Systems Built on FPGAs, Programmed with OpenCL
Deep Learning on FPGAs: Past, Present, and Future
FPGAs Challenge GPUs as a Platform for Deep Learning
- blog: https://www.tractica.com/automation-robotics/fpgas-challenge-gpus-as-a-platform-for-deep-learning/
Convolution Neural Network (CNN) Implementation on Altera FPGA using OpenCL
Accelerating Deep Learning Using Altera FPGAs (Embedded Vision Summit)
- youtube: https://www.youtube.com/watch?v=HlBC9qBqZRs
- slides: http://www.slideshare.net/embeddedvision/accelerating-deep-learning-using-altera-fpgas-a-presentation-from-intel
Machine Learning on FPGAs: Neural Networks
Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs
Microsoft Goes All in for FPGAs to Build Out AI Cloud
Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks
Intel Unveils FPGA to Accelerate Neural Networks
http://datacenterfrontier.com/intel-unveils-fpga-to-accelerate-ai-neural-networks/
Deep Learning with FPGA
A General Neural Network Hardware Architecture on FPGA
- intro: University of Birmingham
- arxiv: https://arxiv.org/abs/1711.05860
Approximate FPGA-based LSTMs under Computation Time Constraints
- intro: ARC 2018
- arxiv: https://arxiv.org/abs/1801.02190
ARM / Processor
‘Neural network’ spotted deep inside Samsung’s Galaxy S7 silicon brain: Secrets of Exynos M1 cores spilled
Intel will add deep-learning instructions to its processors
SRAM
ShiDianNao: Shifting Vision Processing Closer to the Sensor
http://lap.epfl.ch/files/content/sites/lap/files/shared/publications/DuJun15_ShiDianNaoShiftingVisionProcessingCloserToTheSensor_ISCA15.pdf
Blogs
Emerging “Universal” FPGA, GPU Platform for Deep Learning
An Early Look at Startup Graphcore’s Deep Learning Chip
https://www.nextplatform.com/2017/03/09/early-look-startup-graphcores-deep-learning-chip/
Hardware for Deep Learning
https://medium.com/towards-data-science/hardware-for-deep-learning-8d9b03df41a
Videos
Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural Networks
- intro: September 2016 Embedded Vision Alliance Member Meeting Presentation: MIT
- youtube: https://www.youtube.com/watch?v=dO_lHz87DVM