History of the Digital Revolution
Week #4: Audio Compression and Mp3 File Format
COMPETING FORMATS
• 33 1/3 and 45s versus 78 RPM
• VHS versus Beta
• NTSC & PAL & SECAM
• AC/DC
• LaserDisc & DVD
• DVD versus Blu-ray
• Often: hardware driven
Types of Digital Audio
• Uncompressed • Lossless Compression • Lossy Compression
• Uncompressed:
  • WAV “Waveform Audio File Format” (Microsoft/IBM)
  • AIFF “Audio Interchange File Format” (Apple)
  • AU (Sun Microsystems), 8 kHz rate
  • PCM “Pulse Code Modulation”
• Lossless Compression (2:1):
  • FLAC “Free Lossless Audio Codec” (2001)
  • ATRAC Lossless “Adaptive Transform Acoustic Coding” (Sony) (1999)
  • ALAC (.m4a) “Apple Lossless Audio Codec” (Apple) (2004)
  • MPEG-4 SLS/ALS/DST
  • Monkey’s Audio (2000)
  • WMA Lossless (Microsoft) (2003)
• Lossy Compression (~10:1):
  • Opus (Internet Engineering Task Force) (2012)
  • MP3 (1993)
  • Musepack/MPC (based on MP2) (1997)
  • Vorbis
  • AAC (Bell/Fraunhofer/Dolby/Sony/Nokia) (1997)
  • ATRAC (lossy) (1992)
  • Musicam (MP2) (1993)
  • WMA (lossy) (1998)
MP3
Stereo, 16-bit audio at 44.1 kHz = 44.1k x 16 x 2 = 1411 kbit/s (about 176 kbyte/s, or 172 KiB/s)
Mp3 standard based on 128 kbit/s (or 16 kbyte/s)
• 1411 ÷ 128 = 11:1 compression
• 160 kbit/s = 9:1
• 192 kbit/s = 7:1
• 320 kbit/s = 4.4:1
Mono, 8-bit audio at 44.1 kHz = 44.1k x 8 x 1 = 353 kbit/s
• 128 kbit/s = 2.8:1
• 160 kbit/s = 2.2:1
• 192 kbit/s = 1.8:1
• 320 kbit/s = 1.1:1 (almost no compression)
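The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and 1 kbit is taken as 1000 bits.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM bit rate in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


def compression_ratio(source_kbps, target_kbps):
    """How many times smaller the encoded stream is than the PCM source."""
    return source_kbps / target_kbps


# Bit rates listed on the slide.
MP3_BITRATES_KBPS = (128, 160, 192, 320)

stereo_cd = pcm_bitrate_kbps(44_100, 16, 2)  # 1411.2 kbit/s
mono_8bit = pcm_bitrate_kbps(44_100, 8, 1)   # 352.8 kbit/s

for label, source in (("Stereo, 16-bit", stereo_cd), ("Mono, 8-bit", mono_8bit)):
    print(f"{label} @ 44.1 kHz = {source:.1f} kbit/s")
    for target in MP3_BITRATES_KBPS:
        print(f"  {target} kbit/s -> {compression_ratio(source, target):.1f}:1")
```

The slide rounds the ratios to whole numbers in places, so the printed values can differ slightly in the last digit.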
Audio Compression
Audio compression is the reduction in the transmission bandwidth and storage requirements of audio data.
Methods:
• Reduce Sampling Frequency
  • Nyquist Limit (the highest frequency that can be captured is half the sampling rate)
• Reduce Bit Depth (notches in the ruler)
  • 8 bit has only 256 possible values
    • 48 dB dynamic range
  • 16 bit has 65,536 possible values
    • 96 dB dynamic range (noise floor can reduce it to 78 dB or ~13 bit)
  • 24 bit has 16,777,216 possible values!
    • 144 dB dynamic range (noise floor can reduce it to 126 dB or ~21 bit)
• Perceptual Coding
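The numbers behind the first two methods follow from simple formulas: the Nyquist limit is half the sampling rate, an n-bit sample has 2^n possible values, and the theoretical dynamic range is 20·log10(2^n), roughly 6 dB per bit. A minimal sketch, with function names chosen here for illustration:

```python
import math


def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2


def quantization_levels(bit_depth):
    """Number of distinct values an n-bit sample can take."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range: 20 * log10(2 ** bits), about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bit_depth)


for bits in (8, 16, 24):
    levels = quantization_levels(bits)
    print(f"{bits}-bit: {levels:,} values, ~{dynamic_range_db(bits):.0f} dB")

print(f"Nyquist limit at 44.1 kHz: {nyquist_limit_hz(44_100):.0f} Hz")
```

These are ideal figures; as the slide notes, a real noise floor lowers the usable range.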
Audio Compression
So… size and quality are directly proportional to each other.
[Figure: SIZE versus QUALITY]
Perceptual Coding
• Doesn’t digitally reproduce the exact signal (as does PCM)
• Models how we listen and hear music
• Determines what factors are most important to the quality of music
• Creates a psychoacoustic model of what is and what is not important to the listener in order to perceive quality.
  • Redundancy
  • Irrelevancy
  • Masking
• Exploits the limitations of human hearing
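To make the masking idea concrete, here is a deliberately simplified toy in Python. It is not how MP3 or any real codec decides what to discard: real psychoacoustic models use critical bands and spreading functions. The 200 Hz window, the 20 dB margin, the function name, and the sample tones are all assumptions invented for this illustration.

```python
def drop_masked(components, window_hz=200.0, margin_db=20.0):
    """Toy masking rule (assumed, not from any standard).

    components: list of (frequency_hz, level_db) pairs.
    A component counts as masked when another component within window_hz
    of it is at least margin_db louder. Returns the unmasked components.
    """
    kept = []
    for freq, level in components:
        masked = any(
            abs(freq - other_freq) <= window_hz
            and other_level - level >= margin_db
            for other_freq, other_level in components
        )
        if not masked:
            kept.append((freq, level))
    return kept


# Hypothetical tones: a loud 1 kHz tone, a quiet neighbour, a quiet distant tone.
tones = [(1000.0, 80.0), (1100.0, 50.0), (4000.0, 50.0)]
print(drop_masked(tones))
```

The quiet 1100 Hz tone sits next to a much louder one and is dropped, while the equally quiet 4000 Hz tone is kept because nothing loud is near it. An encoder built on this principle spends no bits on what the listener would not hear.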
Audio Compression
• Audio Algorithms are the “secret sauce” – psychoacoustic
• Codecs contain one or more algorithms
• Containers wrap codec-encoded audio together with other information (e.g., metadata, tags, headers, etc.)
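One simple, real example of "other information" stored alongside encoded audio is the ID3v1 tag: a fixed 128-byte block at the end of an Mp3 file that starts with the bytes "TAG". An Mp3 file with a tag is not a full container in the sense of MP4 or Ogg, so treat this only as a sketch of the general idea. The helper names and all field values below are made up for the demonstration.

```python
def parse_id3v1(data):
    """Return a dict of ID3v1 fields from the last 128 bytes, or None if absent."""
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded, usually with NUL bytes.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre_index": tag[127],
    }


def field(value, size):
    """Encode a string as a NUL-padded fixed-width field (demo helper)."""
    return value.encode("latin-1").ljust(size, b"\x00")


# Made-up stand-in for encoded audio frames, followed by a made-up tag.
fake_audio = b"\xff\xfb" + b"\x00" * 64
fake_tag = (
    b"TAG"
    + field("Example Title", 30)
    + field("Example Artist", 30)
    + field("Example Album", 30)
    + field("2000", 4)
    + field("", 30)
    + bytes([255])
)

info = parse_id3v1(fake_audio + fake_tag)
print(info["title"], "/", info["artist"])
```

The codec data and the descriptive data live in the same file but are read separately, which is the distinction the slide draws between a codec and what surrounds it.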
The Players
• International Organization for Standardization (ISO)
• Moving Picture Experts Group (MPEG)
• Fraunhofer Institut (aka Fraunhofer Society)
• Karlheinz Brandenburg
• AT&T Bell Labs
• Thomson (TMS)
• Philips
• Suzanne Vega
Timeline
• 1982: Brandenburg is asked to help his PhD advisor solve the problem of transmitting audio over ISDN lines
• 1988: ISO’s recommendations; MPEG is formed
• 1988: First sale of Mp3 encoder (to a radio station in Saipan, Micronesia)
• 1991: Musicam (Mp2) chosen over Mp3 by MPEG
• 1992: ISO includes MP3 as one of approved codecs
• 1994: Fraunhofer releases first Mp3 encoder, L3enc
• 1995: File extension changed from “.bit” to “.mp3”
• 1995: First Mp3 player released, WinPlay3
• 1997: Microsoft incorporates Mp3 into Windows Media Player
• 1997: Hacked, free version of Mp3 encoder named “Thank You Fraunhofer” is widely released
• 1997: Mp3.com created
• 1997: Winamp released
• 1998: First portable players released, MPMan and the Rio
• 1999: Napster invented
• 2000: Standalone version of LAME released
• 2003: Pirate Bay created
• 2003: Mp3.com sold and effectively shuts down
• 2005: Megaupload created
• 2009: Pirate Bay founders sent to jail
• 2017: Mp3 patents will expire
Audio Tests
• STAX headphones ($2,000 each)
• 1812 Overture (relatively easy to encode)
• Tracy Chapman
• Gloria Estefan
• “Tom’s Diner” (Suzanne Vega)
  • Solo instruments very hard to encode without errors (“the lonely voice”)
  • Fewer predictable patterns
  • Subtle variations abound in speech (e.g., implosives, glottal stops, etc.)
  • Left/Right channels so similar that any flaws in the encoding are accentuated
  • Bell Labs’ volunteer listeners rebelled after listening hundreds of times to the same 4-second snippet
• Hockey sounds (instrumental in the NHL becoming the first large-scale licensee in the US, for radio broadcasts)
• Steely Dan (instrumental in convincing Telos to become the first enterprise customer)
• MPEG Audio Tests:
  • Ornette Coleman solo
  • “Fast Car”
  • Trumpet solo
  • Glockenspiel
  • Fireworks
  • Two bass solos
  • Castanet
  • Newscast
  • “Tom’s Diner”
Discussion Topics
1. Did the Mp3 file format kill the music business?
2. Do you think that quality is important to the average listener?
3. What do you think will happen once the patents expire?
4. Do you think that improvements in bandwidth and storage will make Mp3s obsolete?