See also my description of the complete WAVE format, and a program to convert WAVE files to the canonical format.
Find this useful? Tips are welcome!
The canonical WAVE format starts with the RIFF header:
Offset  Length   Contents
  0     4 bytes  'RIFF'
  4     4 bytes  <file length - 8>
  8     4 bytes  'WAVE'
(The '8' in the second entry is the length of the first two entries. I.e., the second entry is the number of bytes that follow in the file.)
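As a minimal sketch of the relationship described above, the following Python reads the 12-byte RIFF header and returns its length field. The function name parse_riff_header is hypothetical, chosen for illustration; only the standard struct module is used.

```python
import struct

def parse_riff_header(blob: bytes) -> int:
    """Check the 12-byte RIFF header and return its length field."""
    # '<' = little-endian, '4s' = 4 raw bytes, 'I' = unsigned 32-bit integer
    riff_id, riff_len, wave_id = struct.unpack("<4sI4s", blob[:12])
    if riff_id != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    return riff_len

# A bare header with no chunks after it: only 'WAVE' (4 bytes)
# follows the length field, so the field holds 4.
header = struct.pack("<4sI4s", b"RIFF", 4, b"WAVE")
riff_len = parse_riff_header(header)
```

For a well-formed file, the returned value should equal the total file length minus 8.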
Next, the fmt chunk describes the sample format:
 12     4 bytes  'fmt '
 16     4 bytes  0x00000010      // Length of the fmt data (16 bytes)
 20     2 bytes  0x0001          // Format tag: 1 = PCM
 22     2 bytes  <channels>      // Channels: 1 = mono, 2 = stereo
 24     4 bytes  <sample rate>   // Samples per second: e.g., 44100
 28     4 bytes  <bytes/second>  // sample rate * block align
 32     2 bytes  <block align>   // channels * bits/sample / 8
 34     2 bytes  <bits/sample>   // 8 or 16
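The two derived fields can be computed from the other three. The sketch below packs the 24 bytes of the fmt chunk (the 8-byte chunk header plus 16 bytes of data) using the formulas in the comments above. The function name build_fmt_chunk is an illustrative assumption.

```python
import struct

def build_fmt_chunk(channels: int, sample_rate: int, bits_per_sample: int) -> bytes:
    """Pack the canonical PCM fmt chunk (file offsets 12 through 35)."""
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    bytes_per_second = sample_rate * block_align
    return struct.pack(
        "<4sIHHIIHH",        # little-endian, no padding between fields
        b"fmt ",             # chunk ID (note the trailing space)
        16,                  # length of the fmt data
        1,                   # format tag: 1 = PCM
        channels,
        sample_rate,
        bytes_per_second,
        block_align,
        bits_per_sample,
    )

# 16-bit stereo at 44100 Hz: block align 4, 176400 bytes per second
fmt_chunk = build_fmt_chunk(2, 44100, 16)
```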
Finally, the data chunk contains the sample data:
 36     4 bytes  'data'
 40     4 bytes  <length of the data block>
 44       bytes  <sample data>
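Putting the three parts together, a complete canonical file is a 44-byte header followed by the sample data. The sketch below is one way to assemble it, under the assumption that the caller passes already-encoded sample bytes of even length. The name build_canonical_wave is hypothetical. The result is read back with Python's standard wave module as a sanity check.

```python
import io
import struct
import wave

def build_canonical_wave(sample_data: bytes, channels: int,
                         sample_rate: int, bits_per_sample: int) -> bytes:
    """Return a 44-byte canonical header followed by sample_data."""
    if len(sample_data) % 2:
        # Assumption of this sketch: the caller handles padding.
        raise ValueError("sample data must have an even length")
    block_align = channels * bits_per_sample // 8
    header = struct.pack(
        "<4sI4s" "4sIHHIIHH" "4sI",
        # RIFF header: length field = total file length - 8
        b"RIFF", 36 + len(sample_data), b"WAVE",
        # fmt chunk
        b"fmt ", 16, 1, channels, sample_rate,
        sample_rate * block_align, block_align, bits_per_sample,
        # data chunk header
        b"data", len(sample_data),
    )
    return header + sample_data

# Two 16-bit stereo frames (8 bytes of sample data)
blob = build_canonical_wave(b"\x00\x00\x01\x00\xff\xff\x02\x00", 2, 44100, 16)

# Read the result back with the standard library as a check
with wave.open(io.BytesIO(blob), "rb") as w:
    info = (w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes())
```

The constant 36 is the 44-byte header minus the 8 bytes that the RIFF length field does not count.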
The sample data must end on an even byte boundary. All numeric data fields are in the Intel format of low-high byte ordering (little-endian). 8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767.
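To illustrate the two sample encodings, here is a small Python sketch. The helper names (encode_8bit, encode_16bit, pad_to_even) are hypothetical, and the use of a zero byte for padding is an assumption based on the usual RIFF convention, not something stated above.

```python
import struct

def encode_8bit(samples) -> bytes:
    """8-bit samples: unsigned bytes, 0 to 255."""
    return bytes(samples)   # raises ValueError for values outside 0..255

def encode_16bit(samples) -> bytes:
    """16-bit samples: signed 2's-complement, low byte first."""
    # 'h' = signed 16-bit; struct.error is raised outside -32768..32767
    return struct.pack("<%dh" % len(samples), *samples)

def pad_to_even(data: bytes) -> bytes:
    """Append one pad byte if needed so the data ends on an even boundary.
    Assumption: the pad byte is zero."""
    return (data + b"\x00") if len(data) % 2 else data

eight = encode_8bit([0, 128, 255])
sixteen = encode_16bit([-32768, -1, 0, 32767])
padded = pad_to_even(eight)
```

Note that the midpoint of the 8-bit range is 128, not 0, since those samples are unsigned.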
For multi-channel data, samples are interleaved between channels, like this:
    sample 0 for channel 0
    sample 0 for channel 1
    sample 1 for channel 0
    sample 1 for channel 1
    ...
For stereo audio, channel 0 is the left channel and channel 1 is the right.
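The interleaving order can be sketched in a few lines of Python. The functions interleave and deinterleave are illustrative helpers, not part of any library. They operate on lists of sample values before encoding.

```python
def interleave(*channels):
    """Merge per-channel sample lists into one interleaved list."""
    if len({len(c) for c in channels}) > 1:
        raise ValueError("all channels must have the same number of samples")
    # zip() groups sample i of every channel into one frame
    return [sample for frame in zip(*channels) for sample in frame]

def deinterleave(samples, channel_count):
    """Split an interleaved list back into per-channel lists."""
    return [samples[i::channel_count] for i in range(channel_count)]

left = [1, 2, 3]       # channel 0
right = [10, 20, 30]   # channel 1
mixed = interleave(left, right)
```

For the stereo case, deinterleave(mixed, 2) returns the left and right lists again, in that order.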