Abstract: Audio signals are sampled at high temporal resolutions, and learning to
synthesize audio requires capturing structure across a range of timescales.
Generative adversarial networks (GANs) have had wide success at generating
images that are both locally and globally coherent, but they have seen little
application to audio generation. In this paper, we introduce WaveGAN, a first
attempt at applying GANs to unsupervised synthesis of raw-waveform audio.
WaveGAN is capable of synthesizing one-second slices of ...