Audio / Image / Video Generation
Papers
Optimizing Neural Networks That Generate Images
- intro: 2014 PhD thesis
- paper: http://www.cs.toronto.edu/~tijmen/tijmen_thesis.pdf
- github: https://github.com/mrkulk/Unsupervised-Capsule-Network
Learning to Generate Chairs, Tables and Cars with Convolutional Networks
DRAW: A Recurrent Neural Network For Image Generation
- intro: Google DeepMind
- arxiv: http://arxiv.org/abs/1502.04623
- github: https://github.com/vivanov879/draw
- github(Theano): https://github.com/jbornschein/draw
- github(Lasagne): https://github.com/skaae/lasagne-draw
- youtube: https://www.youtube.com/watch?v=Zt-7MI9eKEo&hd=1
- video: http://pan.baidu.com/s/1gd3W6Fh
What is DRAW (Deep Recurrent Attentive Writer)?
- blog: http://kvfrans.com/what-is-draw-deep-recurrent-attentive-writer/
- github(TensorFlow): https://github.com/kvfrans/draw
Colorizing the DRAW Model
Understanding and Implementing DeepMind’s DRAW Model
Generative Image Modeling Using Spatial LSTMs
Conditional generative adversarial nets for convolutional face generation
- paper: http://www.foldl.me/uploads/2015/conditional-gans-face-generation/paper.pdf
- blog: http://www.foldl.me/2015/conditional-gans-face-generation/
- github: https://github.com/hans/adversarial
Generating Images from Captions with Attention
- arxiv: http://arxiv.org/abs/1511.02793
- github: https://github.com/emansim/text2image
- demo: http://www.cs.toronto.edu/~emansim/cap2im.html
Attribute2Image: Conditional Image Generation from Visual Attributes
- intro: University of Michigan & Adobe Research & NEC Labs
- project page: https://sites.google.com/site/attribute2image/
- arxiv: http://arxiv.org/abs/1512.00570
- github(Torch): https://github.com/xcyan/eccv16_attr2img
Autoencoding beyond pixels using a learned similarity metric
- arxiv: http://arxiv.org/abs/1512.09300
- demo: http://algoalgebra.csa.iisc.ernet.in/deepimagine/
- github: https://github.com/andersbll/autoencoding_beyond_pixels
- github(TensorFlow): https://github.com/timsainb/Tensorflow-MultiGPU-VAE-GAN
- video: http://video.weibo.com/show?fid=1034:f00b4e5a34e8c1ebe78ccd00da95f9e0
- github: https://github.com/stitchfix/fauxtograph
Deep Visual Analogy-Making
- paper: https://papers.nips.cc/paper/5845-deep-visual-analogy-making
- github(TensorFlow): https://github.com/carpedm20/visual-analogy-tensorflow
- slides: http://slideplayer.com/slide/9147672/
- mirror: http://pan.baidu.com/s/1pKgrdnt
Pixel Recurrent Neural Networks
- intro: Google DeepMind. ICML 2016 best paper. PixelRNN
- arxiv: http://arxiv.org/abs/1601.06759
- github: https://github.com/igul222/pixel_rnn
- github(TensorFlow): https://github.com/carpedm20/pixel-rnn-tensorflow
- notes(by Hugo Larochelle): https://www.evernote.com/shard/s189/sh/fdf61a28-f4b6-491b-bef1-f3e148185b18/aba21367d1b3730d9334ed91d3250848
- video(by Hugo Larochelle): https://www.periscope.tv/hugo_larochelle/1ypKdnMkjBnJW
Generating images with recurrent adversarial networks
- arxiv: http://arxiv.org/abs/1602.05110
- github: https://github.com/jiwoongim/GRAN
Pixel-Level Domain Transfer
- intro: ECCV 2016
- github(Torch): https://github.com/fxia22/PixelDTGAN
- author page(Code and dataset): https://dgyoo.github.io/
Generative Adversarial Text to Image Synthesis
- intro: ICML 2016
- arxiv: http://arxiv.org/abs/1605.05396
- project page: https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/embeddings-for-image-classification/generative-adversarial-text-to-image-synthesis/
- github: https://github.com/reedscot/icml2016
- code+dataset: http://datasets.d2.mpi-inf.mpg.de/akata/cub_txt.tar.gz
Conditional Image Generation with PixelCNN Decoders
- intro: Google DeepMind. PixelCNN 2.0
- arxiv: http://arxiv.org/abs/1606.05328
- github(Theano): https://github.com/kundan2510/pixelCNN
- github(Torch): https://github.com/dritchie/pixelCNN
- github(TensorFlow): https://github.com/anantzoid/Conditional-PixelCNN-decoder
Inverting face embeddings with convolutional neural networks
- arxiv: http://arxiv.org/abs/1606.04189
- github: https://github.com/pavelgonchar/face-transfer-tensorflow
Unsupervised Cross-Domain Image Generation
- intro: Facebook AI Research. Domain Transfer Network (DTN)
- arxiv: https://arxiv.org/abs/1611.02200
- github(TensorFlow): https://github.com/yunjey/dtn-tensorflow
PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications
- intro: OpenAI
- arxiv: https://arxiv.org/abs/1701.05517
- paper: http://openreview.net/pdf?id=BJrFC6ceg
- github: https://github.com/openai/pixel-cnn
Generating Interpretable Images with Controllable Structure
- intro: Google DeepMind
- paper: http://www.scottreed.info/files/iclr2017.pdf
Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
- intro: University of Wyoming & Geometric Intelligence & Montreal Institute for Learning Algorithms & University of Freiburg
- project page: http://www.evolvingai.org/ppgn
- paper: http://www.evolvingai.org/files/nguyen2016ppgn_v1.pdf
- github: https://github.com/Evolving-AI-Lab/ppgn
Image Generation and Editing with Variational Info Generative Adversarial Networks
DeepFace: Face Generation using Deep Learning
Multi-View Image Generation from a Single-View
- intro: Southwest Jiaotong University & National University of Singapore
- arxiv: https://arxiv.org/abs/1704.04886
Generative Cooperative Net for Image Generation and Data Augmentation
- arxiv: https://arxiv.org/abs/1705.02887
Statistics of Deep Generated Images
- arxiv: https://arxiv.org/abs/1708.02688
Sketch-to-Image Generation Using Deep Contextual Completion
- arxiv: https://arxiv.org/abs/1711.08972
Energy-relaxed Wasserstein GANs (EnergyWGAN): Towards More Stable and High Resolution Image Generation
- arxiv: https://arxiv.org/abs/1712.01026
Spatial PixelCNN: Generating Images from Patches
- arxiv: https://arxiv.org/abs/1712.00714
Visual to Sound: Generating Natural Sound for Videos in the Wild
- intro: University of North Carolina at Chapel Hill & Adobe Research
- project page: http://bvision11.cs.unc.edu/bigpen/yipin/visual2sound_webpage/visual2sound.html
- arxiv: https://arxiv.org/abs/1712.01393
Semi-supervised FusedGAN for Conditional Image Generation
- arxiv: https://arxiv.org/abs/1801.05551
Image Transformer
- intro: Google Brain & UC Berkeley
- arxiv: https://arxiv.org/abs/1802.05751
Unpaired Multi-Domain Image Generation via Regularized Conditional GANs
- arxiv: https://arxiv.org/abs/1805.02456
Transferring GANs: generating images from limited data
- intro: Universitat Autònoma de Barcelona
- arxiv: https://arxiv.org/abs/1805.01677
- github: https://github.com/yaxingwang/Transferring-GANs
Cross Domain Image Generation through Latent Space Exploration with Adversarial Loss
- arxiv: https://arxiv.org/abs/1805.10130
Face Image Generation
Fader Networks: Manipulating Images by Sliding Attributes
- intro: NIPS 2017. Facebook AI Research & Sorbonne Université
- arxiv: https://arxiv.org/abs/1706.00409
- github: https://github.com/facebookresearch/FaderNetworks
Person Image Generation
Disentangled Person Image Generation
- intro: CVPR 2018 spotlight. KU-Leuven/PSI & Max Planck Institute for Informatics & ETH Zurich
- arxiv: https://arxiv.org/abs/1712.02621
Pose Guided Person Image Generation
- intro: NIPS 2017
- arxiv: https://arxiv.org/abs/1705.09368
- poster: https://homes.esat.kuleuven.be/~liqianma/NIPS17_PG2/NIPS17_PG2_poster.pdf
Deformable GANs for Pose-based Human Image Generation
- intro: University of Trento & Inria Grenoble Rhone-Alpes
- arxiv: https://arxiv.org/abs/1801.00055
- github: https://github.com/AliaksandrSiarohin/pose-gan
Unpaired Pose Guided Human Image Generation
- arxiv: https://arxiv.org/abs/1901.02284
Video Generation
MoCoGAN: Decomposing Motion and Content for Video Generation
- arxiv: https://arxiv.org/abs/1707.04993
- github: https://github.com/sergeytulyakov/mocogan
- github(PyTorch): https://github.com/DLHacks/mocogan
Attentive Semantic Video Generation using Captions
- arxiv: https://arxiv.org/abs/1708.05980
Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture
- intro: AAAI 2018. The University of Tokyo
- project page: http://www.mi.t.u-tokyo.ac.jp/assets/publication/hierarchical_video_generation_sup/
- arxiv: https://arxiv.org/abs/1711.09618
Towards an Understanding of Our World by GANing Videos in the Wild
- intro: ETH Zurich
- arxiv: https://arxiv.org/abs/1711.11453
- github: https://github.com/bernhard2202/improved-video-gan
Video Generation from Single Semantic Label Map
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1903.04480
- github: https://github.com/junting/seg2vid
Deep Generative Model
Digit Fantasies by a Deep Generative Model
Conditional generative adversarial nets for convolutional face generation
- paper: http://www.foldl.me/uploads/2015/conditional-gans-face-generation/paper.pdf
- blog: http://www.foldl.me/2015/conditional-gans-face-generation/
- github: https://github.com/hans/adversarial
Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
- intro: NIPS 2015
- project page: http://soumith.ch/eyescream/
- homepage: http://www.cs.nyu.edu/~denton/
- arxiv: http://arxiv.org/abs/1506.05751
- code: http://soumith.ch/eyescream/
- notes: http://colinraffel.com/wiki/deep_generative_image_models_using_a_laplacian_pyramid_of_adversarial_networks
Torch convolutional GAN: Generating Faces with Torch
One-Shot Generalization in Deep Generative Models
- intro: Google DeepMind. ICML 2016
- arxiv: http://arxiv.org/abs/1603.05106
Generative Image Modeling using Style and Structure Adversarial Networks
Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet
- project page: http://www.stat.ucla.edu/~jxie/STGConvNet/STGConvNet.html
- paper: http://www.stat.ucla.edu/~jxie/STGConvNet/STGConvNet_file/doc/STGConvNet.pdf
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks
ArtGAN: Artwork Synthesis with Conditional Categorical GANs
Learning to Generate Chairs with Generative Adversarial Nets
- arxiv: https://arxiv.org/abs/1705.10413
Blogs
Torch convolutional GAN: Generating Faces with Torch
Generating Large Images from Latent Vectors
- blog: http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors/
Generating Faces with Deconvolution Networks
- blog: https://zo7.github.io/blog/2016/09/25/generating-faces.html
- github: https://github.com/zo7/facegen
Attention Models in Image and Caption Generation
Deconvolution and Checkerboard Artifacts
- :star::star::star::star::star:
- intro: Google Brain & Université de Montréal
- blog: http://distill.pub/2016/deconv-checkerboard/
Projects
TF-VAE-GAN-DRAW
- intro: A collection of generative methods implemented with TensorFlow (Deep Convolutional Generative Adversarial Networks (DCGAN), Variational Autoencoder (VAE) and DRAW: A Recurrent Neural Network For Image Generation).
- github: https://github.com/ikostrikov/TensorFlow-VAE-GAN-DRAW
Generating Large Images from Latent Vectors
- project page: http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors/
- github: https://github.com/hardmaru/cppn-gan-vae-tensorflow
Generating Large Images from Latent Vectors - Part Two
- project page: http://blog.otoro.net/2016/06/02/generating-large-images-from-latent-vectors-part-two/
- github: https://github.com/hardmaru/resnet-cppn-gan-tensorflow
Analyzing 50k fonts using deep neural networks
- blog: https://erikbern.com/2016/01/21/analyzing-50k-fonts-using-deep-neural-networks/
- github: https://github.com/erikbern/deep-fonts
Generate cat images with neural networks
- intro: GAN, spatial transformers, weight initialization and LeakyReLUs.
- github: https://github.com/aleju/cat-generator
Generate human faces with neural networks
A TensorFlow implementation of DeepMind’s WaveNet paper
- intro: This is a TensorFlow implementation of the WaveNet generative neural network architecture for image generation.
- github: https://github.com/Zeta36/tensorflow-image-wavenet