MPEG Audio Compression-1

7/27/2019 MPEG Audio Compression-1


MPEG AUDIO COMPRESSION



1. PSYCHOACOUSTICS

| Psychoacoustics is the study of how the ear and brain interact as various sounds enter the ear.

| A perceptual audio codec is a codec that takes advantage of these characteristics of human hearing.

| The range of human hearing is about 20 Hz to about 20 kHz

| Humans typically perceive frequencies from about 2 kHz to 4 kHz most easily

| The dynamic range, the ratio of the maximum sound amplitude to the quietest sound that humans can hear, is on the order of 120 dB
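
To make the 120 dB figure concrete, the following minimal sketch converts between a level in dB and an amplitude ratio. It assumes the usual convention that a level in dB is 20 times the base-10 logarithm of an amplitude (pressure) ratio; the function names are illustrative and do not come from the slides.

```python
import math


def db_to_amplitude_ratio(level_db: float) -> float:
    """Amplitude ratio for a level in dB, assuming dB = 20 * log10(ratio)."""
    return 10.0 ** (level_db / 20.0)


def amplitude_ratio_to_db(ratio: float) -> float:
    """Level in dB for an amplitude ratio (ratio must be positive)."""
    return 20.0 * math.log10(ratio)


# A 120 dB dynamic range corresponds to an amplitude ratio of about a million to one.
print(db_to_amplitude_ratio(120.0))
```

Under this convention, every additional 20 dB multiplies the amplitude ratio by 10.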


EQUAL-LOUDNESS RELATIONS

| Fletcher-Munson Curves

  | Equal-loudness curves display the relationship between perceived loudness (in dB) and stimulus sound volume (Sound Pressure Level, also in dB), as a function of frequency

| Fig. 1 shows the ear's perception of equal loudness:

  | Each curve shows what level of pure-tone stimulus is required to produce the perception of a given loudness, e.g. a 10 dB sound

  | The curves are arranged so that the perceived loudness level gives the same loudness as that loudness level of a pure tone at 1 kHz


EQUAL-LOUDNESS RELATIONS

Fig. 1: Fletcher-Munson equal-loudness curves for the human ear (re-measured by Robinson and Dadson)


THRESHOLD OF HEARING

| A plot of the threshold of human hearing for a pure tone

Fig. 2: Threshold of human hearing, for pure tones



THRESHOLD OF HEARING (CONTD)

| The threshold of hearing curve: if a sound is above the dB level shown, then the sound is audible

| Turning up a tone so that it equals or surpasses the curve means that we can then distinguish the sound

| An approximate formula exists for this curve:

$$\text{Threshold}(f) = 3.64\,(f/1000)^{-0.8} - 6.5\,e^{-0.6\,(f/1000 - 3.3)^2} + 10^{-3}\,(f/1000)^4$$

  | The threshold units are dB, with f in Hz; the frequency chosen as the origin of the dB scale is 2 kHz, so Threshold(f) ≈ 0 at f = 2 kHz
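
As a rough illustration, the approximate threshold formula can be evaluated directly. The sketch below assumes the standard form of the curve (terms 3.64(f/1000)^-0.8, -6.5 exp(-0.6(f/1000 - 3.3)^2) and 10^-3 (f/1000)^4, with f in Hz); the function name `hearing_threshold_db` is a hypothetical one chosen for this example.

```python
import math


def hearing_threshold_db(f_hz: float) -> float:
    """Approximate threshold of hearing in dB for a pure tone at f_hz (Hz, > 0)."""
    k = f_hz / 1000.0  # frequency in kHz
    return (
        3.64 * k ** -0.8
        - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
        + 1e-3 * k ** 4
    )


# Tabulate the curve at a few frequencies across the audible range.
for f in (100, 1000, 2000, 4000, 10000):
    print(f"{f:>6} Hz: {hearing_threshold_db(f):7.2f} dB")
```

The values are high at low frequencies, dip to a minimum in the region of a few kHz, stay close to 0 dB at 2 kHz, and rise again toward the top of the audible range, which matches the shape described for Fig. 2.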


FREQUENCY MASKING

| Experiments have shown that the human ear has 24 frequency bands (basilar membrane).

| Frequencies within the same band, called a critical band, are harder for the human ear to distinguish.


FREQUENCY MASKING

| Suppose there is a dominant tonal component present in an audio signal. The dominant component will introduce a masking threshold that will mask out frequencies in the same critical band (see Figure below).

| This frequency-domain phenomenon is known as simultaneous masking, which has been observed within critical bands.


FREQUENCY MASKING CURVES

| Frequency masking is studied by playing a particular pure tone, say 1 kHz again, at a loud volume, and determining how this tone affects our ability to hear tones nearby in frequency

  | One would generate a 1 kHz masking tone at a fixed sound level of 60 dB, and then raise the level of a nearby tone, e.g., 1.1 kHz, until it is just audible

| The threshold in Fig. 3 plots the audible level for a single masking tone (1 kHz)

| Fig. 4 shows how the plot changes if other masking tones are used



Fig. 3: Effect on threshold for 1 kHz masking tone




Fig. 4: Effect of masking tone at three different frequencies


CRITICAL BANDS

| Critical bandwidth represents the ear's resolving power for simultaneous tones or partials

  | At the low-frequency end, a critical band is less than 100 Hz wide, while for high frequencies the width can be greater than 4 kHz

| Experiments indicate that the critical bandwidth:

  | for masking frequencies < 500 Hz: remains approximately constant in width (about 100 Hz)

  | for masking frequencies > 500 Hz: increases approximately linearly with frequency


BARK UNIT

| The Bark unit is defined as the width of one critical band, for any masking frequency

| The idea of the Bark unit: every critical band width is roughly equal in terms of Barks (refer to Fig. 5)

Fig. 5: Effect of masking tones, expressed in Bark units



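CONVERSION: FREQUENCY & CRITICAL BAND NUMBER

| Conversion expressed in the Bark unit:

$$\text{Critical band number (Bark)} = \begin{cases} f/100, & \text{for } f < 500 \\ 9 + 4\log_2(f/1000), & \text{for } f \ge 500 \end{cases}$$

  | f is the frequency in Hz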
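
The conversion can be sketched in a few lines of code. The example below assumes the piecewise form f/100 below 500 Hz and 9 + 4 log2(f/1000) from 500 Hz upward, with f in Hz. The inverse mapping `bark_to_hz` and the helper `approx_critical_bandwidth_hz` are illustrative additions that do not appear on the slides; the latter simply measures the width in Hz of a one-Bark interval centred on a given frequency.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Critical band number (Bark) for a frequency in Hz, using the piecewise formula."""
    if f_hz < 500.0:
        return f_hz / 100.0
    return 9.0 + 4.0 * math.log2(f_hz / 1000.0)


def bark_to_hz(b: float) -> float:
    """Inverse of hz_to_bark; the two branches meet at 5 Bark (500 Hz)."""
    if b < 5.0:
        return 100.0 * b
    return 1000.0 * 2.0 ** ((b - 9.0) / 4.0)


def approx_critical_bandwidth_hz(f_hz: float) -> float:
    """Width in Hz of a one-Bark interval centred on f_hz (illustrative helper)."""
    b = hz_to_bark(f_hz)
    return bark_to_hz(b + 0.5) - bark_to_hz(b - 0.5)


for f in (200, 500, 1000, 4000):
    print(f, round(hz_to_bark(f), 2), round(approx_critical_bandwidth_hz(f), 1))
```

Under these assumptions, the one-Bark width is 100 Hz well below 500 Hz and grows in proportion to frequency above it, consistent with the critical-bandwidth behaviour described earlier.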



TEMPORAL MASKING

| Phenomenon: any loud tone will cause the hearing receptors in the inner ear to become saturated and require time to recover

| The following figures show the results of masking experiments:

Fig. 6: The louder the test tone, the less time it takes for our hearing to get over hearing the masking.



TEMPORAL MASKING

| Temporal masking occurs in the time domain. A stronger tonal component (masker) will mask a weaker one (maskee) if they appear within a small interval of time.

| The masking threshold will mask weaker signals both before (pre-masking) and after (post-masking) the masker.

| Masking will last from 50 to 300 ms, depending on the strength and duration of the masker, as shown in Figure below.


TEMPORAL MASKING

Fig. 14.8: For a masking tone that is played for a longer time, it takes longer before a test tone can be heard. Solid curve: masking tone played for 200 ms; dashed curve: masking tone played for 100 ms.
