colinraffel.com

Audio Codec Improvement Through Noise Substitution

This is the companion site for the AES paper "Using Noise Substitution for Backwards-Compatible Audio Codec Improvement", which describes the method used in the row-mp3 codec to improve the perceived quality of a coded audio file. Roughly speaking, the technique represents the residual introduced by the coding process as colored noise. Preliminary results indicate that, at equivalent data rates, listeners prefer coded files with the accompanying noise. Further information can be found in the paper.
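To make the "residual as colored noise" idea concrete, here is a minimal sketch, not the method from the paper. It assumes a single mono frame and uses equal-width frequency bands in place of the cepstrum- and flux-based estimates described in the paper. The function names band_levels and colored_noise, and the choice of 8 bands, are made up for illustration.

```python
import numpy as np

def band_levels(frame, n_bands=8):
    """RMS spectral magnitude of `frame` in equal-width bands (illustrative split)."""
    spectrum = np.fft.rfft(frame)
    # Band edges as bin indices; equal-width bands are an assumption of this sketch.
    edges = np.linspace(0, len(spectrum), n_bands + 1).astype(int)
    levels = np.array([
        np.sqrt(np.mean(np.abs(spectrum[lo:hi]) ** 2))
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    return levels, edges

def colored_noise(levels, edges, frame_len, rng):
    """Shape white noise so that each band's RMS magnitude matches `levels`."""
    spectrum = np.fft.rfft(rng.standard_normal(frame_len))
    for level, lo, hi in zip(levels, edges[:-1], edges[1:]):
        band = spectrum[lo:hi]
        rms = np.sqrt(np.mean(np.abs(band) ** 2))
        if rms > 0:
            spectrum[lo:hi] = band * (level / rms)
    return np.fft.irfft(spectrum, frame_len)

# Stand-in for one frame of coding residual (random data, not real audio).
residual = np.random.default_rng(0).standard_normal(1024)
levels, edges = band_levels(residual)
noise = colored_noise(levels, edges, len(residual), np.random.default_rng(1))
```

Only the per-band levels need to be kept in order to regenerate noise with a similar spectral shape, which is the general appeal of describing the residual this way.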

Sound Examples

Below is a collection of files meant to demonstrate the results of the noise substitution algorithm. Each of the five audio files is given in a variety of formats for quality comparison (all are stereo with a 44.1 kHz sampling rate):

The original 16-bit PCM files
MP3 files coded as stereo 64 and 80 kbps
64 kbps MP3 files with cepstrum- and flux-based error representation, with and without level modulation

The files in the last group are those augmented with the proposed algorithm. They are presented in 16-bit PCM format to allow the inclusion of the residual representation. This makes the files rather large, so please allow some time for them to load!

Dance music	Lossless audio	64 kbps MP3	80 kbps MP3	Unmodulated cepstrum estimate	Modulated cepstrum estimate	Unmodulated flux estimate	Modulated flux estimate
Electric Guitar	Lossless audio	64 kbps MP3	80 kbps MP3	Unmodulated cepstrum estimate	Modulated cepstrum estimate	Unmodulated flux estimate	Modulated flux estimate
Acoustic Drums	Lossless audio	64 kbps MP3	80 kbps MP3	Unmodulated cepstrum estimate	Modulated cepstrum estimate	Unmodulated flux estimate	Modulated flux estimate
Hip-hop music	Lossless audio	64 kbps MP3	80 kbps MP3	Unmodulated cepstrum estimate	Modulated cepstrum estimate	Unmodulated flux estimate	Modulated flux estimate
Violin	Lossless audio	64 kbps MP3	80 kbps MP3	Unmodulated cepstrum estimate	Modulated cepstrum estimate	Unmodulated flux estimate	Modulated flux estimate

Code

Here is a Python implementation of the noise substitution techniques discussed herein. The required modules are:

NumPy for numerical functions
SciPy for WAV file reading and writing (can be replaced by another module with similar functionality)
Pymad for decoding MP3 files

The requisite MP3 files should be created outside of the programs, preferably using LAME with the commands given by the program, e.g. lame --resample 44.1 --cbr -b 64 -h filename.wav. This particular encoding format is recommended so that the error extraction and analysis work properly. The included files are:

makeRowMp3.py - Wrapper script for creating example files like those above.
getError.py - Aligns and subtracts the coded and original audio file to obtain the coding residual.
noiseAnalyze.py - Functionality for determining per-band critical levels.
noiseSynthesize.py - Functionality for generating the noise representation.
utility.py - Some useful functions.
README - A file containing some of this information in addition to some example calls.
LICENSE - These source files are distributed under the MAME license for now.
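As a rough illustration of the align-and-subtract step that getError.py is described as performing, here is a minimal sketch. It is not the code from the download. It assumes mono (single-channel) arrays, a pure integer-sample delay found by cross-correlation, and the sign convention residual = original - coded. The function names and the max_lag default are hypothetical.

```python
import numpy as np

def estimate_delay(original, coded, max_lag=4096):
    """Lag in samples by which `coded` trails `original` (negative if it leads)."""
    n = min(len(original), len(coded))
    max_lag = max(1, min(max_lag, n - 1))
    a = np.asarray(original[:n], dtype=np.float64)
    b = np.asarray(coded[:n], dtype=np.float64)
    # Cross-correlation via FFT; zero-padding to 2n avoids circular wrap-around.
    size = 2 * n
    xcorr = np.fft.irfft(np.fft.rfft(b, size) * np.conj(np.fft.rfft(a, size)), size)
    # Non-negative lags sit at the start of xcorr, negative lags at the end.
    lags = np.concatenate((np.arange(0, max_lag + 1), np.arange(-max_lag, 0)))
    values = np.concatenate((xcorr[:max_lag + 1], xcorr[-max_lag:]))
    return int(lags[np.argmax(values)])

def coding_residual(original, coded, max_lag=4096):
    """Align `coded` to `original`, then return (original - coded, delay)."""
    delay = estimate_delay(original, coded, max_lag)
    if delay >= 0:
        original_aligned, coded_aligned = original, coded[delay:]
    else:
        original_aligned, coded_aligned = original[-delay:], coded
    n = min(len(original_aligned), len(coded_aligned))
    residual = (np.asarray(original_aligned[:n], dtype=np.float64)
                - np.asarray(coded_aligned[:n], dtype=np.float64))
    return residual, delay

# Synthetic check: a "decoded" signal that is just the original delayed by 1105 samples.
x = np.random.default_rng(0).standard_normal(8000)
decoded = np.concatenate((np.zeros(1105), x))
residual, delay = coding_residual(x, decoded)
```

With real files, each stereo channel would need the same treatment, and a decoded MP3 may also differ from the original in length or level, which this sketch does not handle.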

Please contact me if you have any trouble running the code!