Overlapped subband analysis, usually with the MDCT (Modified
Discrete Cosine Transform). Transforms the signal from the time domain
to the frequency domain.
Quantization. Basically, removes low-amplitude signal components, also
taking into account the SAM (pSycho Acoustic Model) of the HAS
(Human Auditory System). Noise usually has low power!
Entropy coding. Compresses the data, usually with Huffman or arithmetic
coding.
Each transform step inputs $2N$ samples and outputs $N$ MDCT coefficients.
$N$ can vary depending on the characteristics of the sound. For complex sounds
without clear harmonics (such as a plosive sound), shortened windows
improve the performance. For simple sounds (such as a musical instrument),
large windows are better.
The DCT determines the correlation between a set of $N$ numbers (samples)
and $N$ orthogonal cosine functions. Therefore, at the input of the DCT there
are $N$ samples and, at the output, $N$ coefficients.
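The correlation view of the DCT can be sketched as follows. This is only an illustration: it assumes the unnormalized DCT-II (the text does not say which DCT variant is meant), and the names `dct_basis` and `dct` are hypothetical.

```python
import math

def dct_basis(k, N):
    """k-th cosine function of the (assumed) unnormalized DCT-II, sampled at N points."""
    return [math.cos(math.pi / N * (n + 0.5) * k) for n in range(N)]

def dct(x):
    """Correlate the N samples in x with each of the N cosine functions."""
    N = len(x)
    return [sum(s * c for s, c in zip(x, dct_basis(k, N))) for k in range(N)]

# A block that equals one cosine function correlates only with that function,
# because the N cosine functions are mutually orthogonal.
N = 8
samples = dct_basis(2, N)
coefficients = dct(samples)
print([round(c, 6) for c in coefficients])
```

With this input, only the coefficient of index 2 is (numerically) different from zero, which is what orthogonality of the cosine functions implies.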
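The MDCT coefficients $X_k$, $k = 0, \ldots, N-1$, of the PCM samples
$x_n$, $n = 0, \ldots, 2N-1$, are defined as:
\[
X_k = \sum_{n=0}^{2N-1} x_n \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right]
\tag{1}
\]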
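A minimal sketch of one MDCT step follows. It assumes the common definition $X_k = \sum_{n=0}^{2N-1} x_n \cos[\frac{\pi}{N}(n + \frac{1}{2} + \frac{N}{2})(k + \frac{1}{2})]$ and evaluates it directly, without windowing; practical encoders window the block and use fast FFT-based algorithms instead. The function name `mdct` is hypothetical.

```python
import math

def mdct(x):
    """Direct O(N^2) MDCT: takes a block of 2N samples, returns N coefficients."""
    if len(x) == 0 or len(x) % 2 != 0:
        raise ValueError("the block must contain 2N samples, with N >= 1")
    N = len(x) // 2
    return [
        sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]

# 16 PCM samples of a sinusoid (2N = 16) produce N = 8 coefficients.
block = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
coeffs = mdct(block)
print(len(block), "samples ->", len(coeffs), "coefficients")
```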
7 SAM (pSycho Acoustic Model) of the HAS (Human Auditory System)
This means that humans hear best those sounds whose frequencies lie
between 3 kHz and 4 kHz.
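As an illustration, the sketch below evaluates a widely used closed-form approximation of the absolute threshold of hearing (usually attributed to Terhardt) and looks for its minimum. It is an assumption that this approximation matches the ATH model the text refers to; the function name `ath_db_spl` is hypothetical.

```python
import math

def ath_db_spl(f_hz):
    """Approximate threshold in quiet (dB SPL) at frequency f_hz (assumed formula)."""
    f = f_hz / 1000.0  # the approximation is expressed in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# Search the audible range for the frequency with the lowest threshold,
# i.e. the frequency at which the ear is most sensitive.
freqs = range(20, 20001, 10)
most_sensitive = min(freqs, key=ath_db_spl)
print("lowest threshold near", most_sensitive, "Hz")
```

Under this approximation the lowest threshold falls between 3 kHz and 4 kHz, in agreement with the statement above.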
7.2 Frequency resolution and simultaneous masking
The HAS has a limited frequency resolution. Psychoacoustic experiments
have demonstrated that the audible frequencies can be grouped into barks.
Each bark defines the group of frequencies that excite the same cochlear
area, i.e., those frequencies that can be masked by the tone with the
highest energy (in that bark).
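The grouping of frequencies into barks can be sketched with an analytic Hz-to-Bark mapping. The one below is the approximation usually attributed to Zwicker and Terhardt; using it here is an assumption, and the helper names `hz_to_bark` and `band_width_hz` are hypothetical.

```python
import math

def hz_to_bark(f_hz):
    """Approximate Bark value of a frequency in Hz (assumed Zwicker-Terhardt formula)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def band_width_hz(band, step_hz=10):
    """Rough width in Hz of the band whose Bark value has integer part `band`."""
    return step_hz * sum(1 for f in range(20, 20001, step_hz)
                         if int(hz_to_bark(f)) == band)

# Two tones fall in the same band when their Bark values share the integer part.
print(int(hz_to_bark(1000)), int(hz_to_bark(1050)))
# Low bands are narrow, high bands are wide.
print(band_width_hz(2), band_width_hz(20))
```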
Most of the time, the channels of a non-mono audio signal carry similar
sounds. Channel coupling reduces this inter-channel redundancy, usually
by means of prediction techniques.
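One simple way to picture prediction between channels is a single-gain least-squares predictor: the right channel is predicted from the left one and only the prediction residual is kept. This is a toy sketch under that assumption, not the coupling scheme of any particular codec; all function names are hypothetical.

```python
def predict_gain(left, right):
    """Least-squares gain a that minimizes the energy of right - a * left."""
    energy = sum(l * l for l in left)
    if energy == 0:
        return 0.0
    return sum(l * r for l, r in zip(left, right)) / energy

def couple(left, right):
    """Return the gain and the residual that must be coded instead of `right`."""
    a = predict_gain(left, right)
    return a, [r - a * l for l, r in zip(left, right)]

def decouple(left, a, residual):
    """Rebuild the right channel from the left channel, the gain and the residual."""
    return [a * l + e for l, e in zip(left, residual)]

left = [0.5, -0.2, 0.9, -0.7, 0.1, 0.4]
right = [0.41, -0.15, 0.73, -0.55, 0.07, 0.33]  # similar to the left channel
a, residual = couple(left, right)
print(sum(r * r for r in right), sum(e * e for e in residual))
```

When the channels are similar, the residual carries far less energy than the original channel, so it can be coded with fewer bits.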
8 Quantization
Depending on the desired output bit-rate and the frequency (see the
ATH model), the SAM applies a different quantization step to each bark (see
Section 7.1). Roughly speaking, the higher the compression ratio, the
larger the quantization step and, therefore, the quantization noise; and the
higher the frequency, the wider the bark. Notice also that the perception
of a tone in a bark also depends on temporal masking.
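The relation between the quantization step and the quantization noise can be sketched with a plain uniform scalar quantizer. This is an assumption made for illustration (real encoders choose the step per band from the psychoacoustic model and may use non-uniform quantizers); the function names are hypothetical.

```python
def quantize(coeffs, step):
    """Map each coefficient to the index of its nearest multiple of `step`."""
    return [round(c / step) for c in coeffs]

def dequantize(indices, step):
    """Reconstruct approximate coefficients from the quantization indices."""
    return [i * step for i in indices]

def noise_power(coeffs, step):
    """Mean squared error introduced by quantizing with the given step."""
    rec = dequantize(quantize(coeffs, step), step)
    return sum((c - r) ** 2 for c, r in zip(coeffs, rec)) / len(coeffs)

coeffs = [0.03, -0.4, 1.27, 2.5, -3.14, 0.9]
for step in (0.1, 2.0):
    print(step, quantize(coeffs, step), noise_power(coeffs, step))
```

With the larger step, low-amplitude coefficients collapse to index 0 and the measured noise power grows, while the indices become smaller and therefore cheaper to code.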
Usually, a variable bit-rate (VBR) lossless encoding algorithm assigns
code-words of fewer bits to those code-vectors (one or more quantized
MDCT coefficients) with a high probability, and vice versa, producing an
effective reduction of the bit-rate.
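The idea of giving shorter code-words to more probable symbols can be sketched with a small Huffman code-length builder. It is a minimal illustration that treats each quantized coefficient as one symbol and estimates probabilities from the data itself; the function name `huffman_code_lengths` is hypothetical and does not describe the tables of any specific codec.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code-word length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (count, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes all their symbols one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

# Quantized coefficients: small values are much more frequent than large ones.
quantized = [0] * 8 + [1] * 4 + [-1] * 2 + [2] + [-2]
lengths = huffman_code_lengths(quantized)
total_bits = sum(lengths[s] for s in quantized)
print(lengths, total_bits, "bits versus", 3 * len(quantized), "with 3-bit fixed codes")
```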