Audio / Image / Video Generation

Published: 09 Oct 2015 Category: deep_learning

Papers

Optimizing Neural Networks That Generate Images

intro: 2014 PhD thesis
paper: http://www.cs.toronto.edu/~tijmen/tijmen_thesis.pdf
github: https://github.com/mrkulk/Unsupervised-Capsule-Network

Learning to Generate Chairs, Tables and Cars with Convolutional Networks

arxiv: http://arxiv.org/abs/1411.5928

DRAW: A Recurrent Neural Network For Image Generation

intro: Google DeepMind
arxiv: http://arxiv.org/abs/1502.04623
github: https://github.com/vivanov879/draw
github(Theano): https://github.com/jbornschein/draw
github(Lasagne): https://github.com/skaae/lasagne-draw
youtube: https://www.youtube.com/watch?v=Zt-7MI9eKEo&hd=1
video: http://pan.baidu.com/s/1gd3W6Fh

What is DRAW (Deep Recurrent Attentive Writer)?

blog: http://kvfrans.com/what-is-draw-deep-recurrent-attentive-writer/
github(tensorflow): https://github.com/kvfrans/draw

Colorizing the DRAW Model

blog: http://kvfrans.com/colorizing-the-draw-model/
github: https://github.com/kvfrans/draw-color

Understanding and Implementing Deepmind’s DRAW Model

blog: http://evjang.com/articles/draw
github: https://github.com/ericjang/draw

Generative Image Modeling Using Spatial LSTMs

arxiv: http://arxiv.org/abs/1506.03478
github: https://github.com/lucastheis/ride/

Conditional generative adversarial nets for convolutional face generation

Generating Images from Captions with Attention

arxiv: http://arxiv.org/abs/1511.02793
github: https://github.com/emansim/text2image
demo: http://www.cs.toronto.edu/~emansim/cap2im.html

Attribute2Image: Conditional Image Generation from Visual Attributes

intro: University of Michigan & Adobe Research & NEC Labs
project page: https://sites.google.com/site/attribute2image/
arxiv: http://arxiv.org/abs/1512.00570
github(Torch): https://github.com/xcyan/eccv16_attr2img

Autoencoding beyond pixels using a learned similarity metric

arxiv: http://arxiv.org/abs/1512.09300
demo: http://algoalgebra.csa.iisc.ernet.in/deepimagine/
github: https://github.com/andersbll/autoencoding_beyond_pixels
github(Tensorflow): https://github.com/timsainb/Tensorflow-MultiGPU-VAE-GAN
video: http://video.weibo.com/show?fid=1034:f00b4e5a34e8c1ebe78ccd00da95f9e0
github: https://github.com/stitchfix/fauxtograph

Deep Visual Analogy-Making

paper: https://papers.nips.cc/paper/5845-deep-visual-analogy-making
github(Tensorflow): https://github.com/carpedm20/visual-analogy-tensorflow
slides: http://slideplayer.com/slide/9147672/
mirror: http://pan.baidu.com/s/1pKgrdnt

Pixel Recurrent Neural Networks

intro: Google DeepMind. ICML 2016 best paper. PixelRNN
arxiv: http://arxiv.org/abs/1601.06759
github: https://github.com/igul222/pixel_rnn
github(Tensorflow): https://github.com/carpedm20/pixel-rnn-tensorflow
notes(by Hugo Larochelle): https://www.evernote.com/shard/s189/sh/fdf61a28-f4b6-491b-bef1-f3e148185b18/aba21367d1b3730d9334ed91d3250848
video(by Hugo Larochelle): https://www.periscope.tv/hugo_larochelle/1ypKdnMkjBnJW

Generating images with recurrent adversarial networks

arxiv: http://arxiv.org/abs/1602.05110
github: https://github.com/jiwoongim/GRAN

Pixel-Level Domain Transfer

intro: ECCV 2016
github(Torch): https://github.com/fxia22/PixelDTGAN
author page(Code and dataset): https://dgyoo.github.io/

Generative Adversarial Text to Image Synthesis

intro: ICML 2016
arxiv: http://arxiv.org/abs/1605.05396
project page: https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/embeddings-for-image-classification/generative-adversarial-text-to-image-synthesis/
github: https://github.com/reedscot/icml2016
code+dataset: http://datasets.d2.mpi-inf.mpg.de/akata/cub_txt.tar.gz

Conditional Image Generation with PixelCNN Decoders

intro: Google DeepMind. PixelCNN 2.0
arxiv: http://arxiv.org/abs/1606.05328
github(Theano): https://github.com/kundan2510/pixelCNN
github(Torch): https://github.com/dritchie/pixelCNN
github(Tensorflow): https://github.com/anantzoid/Conditional-PixelCNN-decoder

Inverting face embeddings with convolutional neural networks

arxiv: http://arxiv.org/abs/1606.04189
github: https://github.com/pavelgonchar/face-transfer-tensorflow

Unsupervised Cross-Domain Image Generation

intro: Facebook AI Research. Domain Transfer Network (DTN)
arxiv: https://arxiv.org/abs/1611.02200
github(TensorFlow): https://github.com/yunjey/dtn-tensorflow

PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications

intro: OpenAI
arxiv: https://arxiv.org/abs/1701.05517
paper: http://openreview.net/pdf?id=BJrFC6ceg
github: https://github.com/openai/pixel-cnn

Generating Interpretable Images with Controllable Structure

intro: Google DeepMind
paper: http://www.scottreed.info/files/iclr2017.pdf

Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts

arxiv: https://arxiv.org/abs/1612.00215

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space

intro: University of Wyoming & Geometric Intelligence & Montreal Institute for Learning Algorithms & University of Freiburg
project page: http://www.evolvingai.org/ppgn
paper: http://www.evolvingai.org/files/nguyen2016ppgn_v1.pdf
github: https://github.com/Evolving-AI-Lab/ppgn

Image Generation and Editing with Variational Info Generative Adversarial Networks

arxiv: https://arxiv.org/abs/1701.04568

DeepFace: Face Generation using Deep Learning

arxiv: https://arxiv.org/abs/1701.01876

Multi-View Image Generation from a Single-View

intro: Southwest Jiaotong University & National University of Singapore
arxiv: https://arxiv.org/abs/1704.04886

Generative Cooperative Net for Image Generation and Data Augmentation

arxiv: https://arxiv.org/abs/1705.02887

Statistics of Deep Generated Images

arxiv: https://arxiv.org/abs/1708.02688

Sketch-to-Image Generation Using Deep Contextual Completion

arxiv: https://arxiv.org/abs/1711.08972

Energy-relaxed Wasserstein GANs (EnergyWGAN): Towards More Stable and High Resolution Image Generation

arxiv: https://arxiv.org/abs/1712.01026

Spatial PixelCNN: Generating Images from Patches

arxiv: https://arxiv.org/abs/1712.00714

Visual to Sound: Generating Natural Sound for Videos in the Wild

intro: University of North Carolina at Chapel Hill & Adobe Research
project page: http://bvision11.cs.unc.edu/bigpen/yipin/visual2sound_webpage/visual2sound.html
arxiv: https://arxiv.org/abs/1712.01393

Semi-supervised FusedGAN for Conditional Image Generation

arxiv: https://arxiv.org/abs/1801.05551

Image Transformer

intro: Google Brain & UC Berkeley
arxiv: https://arxiv.org/abs/1802.05751

Unpaired Multi-Domain Image Generation via Regularized Conditional GANs

arxiv: https://arxiv.org/abs/1805.02456

Transferring GANs: generating images from limited data

intro: Universitat Autònoma de Barcelona
arxiv: https://arxiv.org/abs/1805.01677
github: https://github.com/yaxingwang/Transferring-GANs

Cross Domain Image Generation through Latent Space Exploration with Adversarial Loss

arxiv: https://arxiv.org/abs/1805.10130

Face Image Generation

Fader Networks: Manipulating Images by Sliding Attributes

intro: NIPS 2017. Facebook AI Research & Sorbonne Université
arxiv: https://arxiv.org/abs/1706.00409
github: https://github.com/facebookresearch/FaderNetworks

Person Image Generation

Disentangled Person Image Generation

intro: CVPR 2018 spotlight
intro: KU-Leuven/PSI & Max Planck Institute for Informatics & ETH Zurich
arxiv: https://arxiv.org/abs/1712.02621

Pose Guided Person Image Generation

intro: NIPS 2017
arxiv: https://arxiv.org/abs/1705.09368
poster: https://homes.esat.kuleuven.be/~liqianma/NIPS17_PG2/NIPS17_PG2_poster.pdf

Deformable GANs for Pose-based Human Image Generation

intro: University of Trento & Inria Grenoble Rhone-Alpes
arxiv: https://arxiv.org/abs/1801.00055
github: https://github.com/AliaksandrSiarohin/pose-gan

Unpaired Pose Guided Human Image Generation

arxiv: https://arxiv.org/abs/1901.02284

Video Generation

MoCoGAN: Decomposing Motion and Content for Video Generation

arxiv: https://arxiv.org/abs/1707.04993
github: https://github.com/sergeytulyakov/mocogan
github(PyTorch): https://github.com/DLHacks/mocogan

Attentive Semantic Video Generation using Captions

arxiv: https://arxiv.org/abs/1708.05980

Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture

intro: AAAI 2018. The University of Tokyo
project page: http://www.mi.t.u-tokyo.ac.jp/assets/publication/hierarchical_video_generation_sup/
arxiv: https://arxiv.org/abs/1711.09618

Towards an Understanding of Our World by GANing Videos in the Wild

intro: ETH Zurich
arxiv: https://arxiv.org/abs/1711.11453
github: https://github.com/bernhard2202/improved-video-gan

Video Generation from Single Semantic Label Map

intro: CVPR 2019
arxiv: https://arxiv.org/abs/1903.04480
github: https://github.com/junting/seg2vid

Deep Generative Model

Digit Fantasies by a Deep Generative Model

demo: http://www.dpkingma.com/sgvb_mnist_demo/demo.html

Conditional generative adversarial nets for convolutional face generation

Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

intro: NIPS 2015
project page: http://soumith.ch/eyescream/
homepage: http://www.cs.nyu.edu/~denton/
arxiv: http://arxiv.org/abs/1506.05751
code: http://soumith.ch/eyescream/
notes: http://colinraffel.com/wiki/deep_generative_image_models_using_a_laplacian_pyramid_of_adversarial_networks

Torch convolutional GAN: Generating Faces with Torch

blog: http://torch.ch/blog/2015/11/13/gan.html
github: https://github.com/skaae/torch-gan

One-Shot Generalization in Deep Generative Models

intro: Google DeepMind. ICML 2016
arxiv: http://arxiv.org/abs/1603.05106

Generative Image Modeling using Style and Structure Adversarial Networks

arxiv: http://arxiv.org/abs/1603.05631
github: https://github.com/xiaolonw/ss-gan

Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

arxiv: http://arxiv.org/abs/1605.09304

ArtGAN: Artwork Synthesis with Conditional Categorical GANs

arxiv: https://arxiv.org/abs/1702.03410

Learning to Generate Chairs with Generative Adversarial Nets

arxiv: https://arxiv.org/abs/1705.10413

Blogs

Generating Large Images from Latent Vectors

blog: http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors/

Generating Faces with Deconvolution Networks

blog: https://zo7.github.io/blog/2016/09/25/generating-faces.html
github: https://github.com/zo7/facegen

Attention Models in Image and Caption Generation

blog: https://casmls.github.io/general/2016/10/16/attention_model.html

Deconvolution and Checkerboard Artifacts

:star::star::star::star::star:
intro: Google Brain & Université de Montréal
blog: http://distill.pub/2016/deconv-checkerboard/

Projects

TF-VAE-GAN-DRAW

intro: A collection of generative methods implemented with TensorFlow (Deep Convolutional Generative Adversarial Networks (DCGAN), Variational Autoencoder (VAE) and DRAW: A Recurrent Neural Network For Image Generation).
github: https://github.com/ikostrikov/TensorFlow-VAE-GAN-DRAW

Generating Large Images from Latent Vectors

project page: http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors/
github: https://github.com/hardmaru/cppn-gan-vae-tensorflow

Generating Large Images from Latent Vectors - Part Two

Analyzing 50k fonts using deep neural networks

Generate cat images with neural networks

intro: GAN, spatial transformers, weight initialization and LeakyReLUs.
github: https://github.com/aleju/cat-generator

Generate human faces with neural networks

github: https://github.com/aleju/face-generator

A TensorFlow implementation of DeepMind’s WaveNet paper

intro: This is a TensorFlow implementation of the WaveNet generative neural network architecture for image generation.
github: https://github.com/Zeta36/tensorflow-image-wavenet
