Need for Data Compression

Reducing the amount of data needed to reproduce images or video (compression) saves storage space, increases access speed, and is the only way to achieve digital motion video on personal computers.
In order to compare video compression systems, one must have
ways to evaluate compression performance. Three key
parameters need to be considered:
i. Amount or degree of compression
ii. Image quality
iii. Speed of compression or decompression.
In addition, we must also look at the hardware and software
required by each compression method.
• Compression is useful because it helps reduce resource usage,
such as data storage space or transmission capacity.
• Because compressed data must be decompressed before it can be used, this extra processing imposes computational or other costs; the situation is far from being a free lunch.
Data compression is subject to a space–time complexity trade-
off. For instance, a compression scheme for video may require
expensive hardware for the video to be decompressed fast
enough to be viewed as it is being decompressed, and the
option to decompress the video in full before watching it may
be inconvenient or require additional storage.
The design of data compression schemes involves trade-offs
among various factors, including the degree of compression,
the amount of distortion introduced (e.g., when using lossy data
compression), and the computational resources required to
compress and uncompress the data.
However, the most important reason for compressing data is that we share more and more data. The Web and its underlying
networks have limitations on bandwidth that define the
maximum number of bits or bytes that can be transmitted from
one place to another in a fixed amount of time.
Non-lossy and Lossy Compression for images
Image compression may be lossless (non-lossy) or lossy. Lossless compression means that the reproduced image is not changed in any way by the compression/decompression process; therefore, we do not have to worry about picture quality in a lossless system: the output picture will be exactly the same as the input picture. Lossless compression is possible because we can use more efficient methods of data transmission than the pixel-by-pixel PCM (Pulse-Code Modulation) format that comes from a digitizer.
Lossless compression is preferred for archival purposes and
often for medical imaging, technical drawings, clip art, or
comics.
Lossless compression is used in cases where it is important that
the original and the decompressed data be identical, or where
deviations from the original data could be deleterious. Typical
examples are executable programs, text documents, and source
code.
Some image file formats, like PNG (Portable Network Graphics) or GIF (Graphics Interchange Format), use only lossless compression, while others like TIFF (Tagged Image File Format) and MNG (Multiple-image Network Graphics) may use either lossless or lossy methods.
Lossless data compression is used in many applications. For
example, it is used in the ZIP file format and in the GNU tool
gzip. It is also often used as a component within lossy data
compression technologies (e.g. lossless mid/side joint stereo
preprocessing by the LAME MP3 encoder and other lossy
audio encoders).
One of the reasons that there is still interest in new lossless compression techniques is that only fairly inexact data can survive lossy compression. It is often the case that the loss of a single bit renders a whole phrase or line of data inaccurate. This is why we attempt to build more and more stable memory systems; the shift from RDRAM to SDRAM, for instance, was partly because RDRAM needed more interactive maintenance of its data.
If we are so protective of data in memory, it makes sense that we must also be protective of the compression scheme we use to store and retrieve data. So the only places lossy compression can be used are places where accuracy at the bit level does not materially affect the quality of the data.
Methods for lossless image compression are:
• Run-length encoding – used as the default method in PCX and as one of the possible methods in BMP, TGA, and TIFF (see the sketch after this list)
• Area image compression
• DPCM and predictive coding
• Entropy encoding
• Adaptive dictionary algorithms such as LZW – used in GIF and TIFF
• DEFLATE – used in PNG, MNG, and TIFF
• Chain codes
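As a concrete illustration of the first method in this list, here is a minimal run-length encoding sketch in Python. It is an illustration only; the byte-level packing actually used by PCX, BMP, TGA, or TIFF differs.

```python
def rle_encode(pixels):
    """Collapse runs of equal values into (value, run_length) pairs."""
    if not pixels:
        return []
    runs = []
    current, count = pixels[0], 1
    for p in pixels[1:]:
        if p == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = p, 1
    runs.append((current, count))
    return runs

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [255, 255, 255, 255, 0, 0, 17, 17, 17]
encoded = rle_encode(row)          # [(255, 4), (0, 2), (17, 3)]
assert rle_decode(encoded) == row  # lossless: the round trip is exact
```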
Lossy compression systems by definition do make some change to the image: something is different.
The trick is making that difference hard for the viewer to see. Lossy compression systems may introduce any of the digital video artifacts, or they may even create some unique artifacts of their own.
None of these effects is easy to quantify, and final decisions about compression systems, or about any specific compressed image, usually have to be made after a subjective evaluation; there is no good alternative to looking at test pictures.
The various measures of analog picture quality (signal-to-noise ratio, resolution, color errors, etc.) may be useful in some cases, but only after viewing real pictures to make sure that the right artifacts are being measured.
Lossy compression methods, especially when used at low bit
rates, introduce compression artifacts. Lossy methods are
especially suitable for natural images such as photographs in
applications where minor (sometimes imperceptible) loss of
fidelity is acceptable to achieve a substantial reduction in bit
rate. The lossy compression that produces imperceptible
differences may be called visually lossless.
Lossy compression is most commonly used to compress multimedia data (audio, video, and still images), especially in applications such as streaming media and internet telephony.
By contrast, lossless compression is typically required for text
and data files, such as bank records and text articles. In many
cases it is advantageous to make a master lossless file that can
then be used to produce compressed files for different
purposes.
For example, a multi-megabyte file can be used at full size to
produce a full-page advertisement in a glossy magazine, and a
10 kilobyte lossy copy can be made for a small image on a web
page.
Methods for lossy compression:
Reducing the color space to the most common colors in the image. The selected colors are specified in the color palette in the header of the compressed image. Each pixel then references the index of a color in the palette; this method can be combined with dithering to avoid posterization.
Chroma subsampling. This takes advantage of the fact that
the human eye perceives spatial changes of brightness more
sharply than those of color, by averaging or dropping some of
the chrominance information in the image.
Transform coding. This is the most commonly used method.
In particular, a Fourier-related transform such as the Discrete
Cosine Transform (DCT) is widely used.
The DCT is sometimes referred to as "DCT-II" in the context of a family of discrete cosine transforms. The more recently developed wavelet transform is also used extensively. The transform step is followed by quantization and entropy coding.
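To make the transform-coding step concrete, here is a small sketch of the 2-D DCT of one 8 x 8 block, assuming NumPy and SciPy are available. It illustrates the transform only, not any particular codec.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D DCT-II: apply the 1-D transform down the columns, then across the rows
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeffs):
    # Inverse transform; without quantization the round trip is exact
    return idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')

block = np.random.randint(0, 256, (8, 8)).astype(float) - 128  # level-shifted pixel block
coeffs = dct2(block)
print(coeffs[0, 0])                       # DC coefficient (top-left), proportional to the block mean
assert np.allclose(idct2(coeffs), block)  # energy is repacked, not lost, by the transform itself
```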
Color

Adding color to a grayscale image is a neat little effect you see all over the place. Now, this isn't to be confused with taking a color image and removing its color, only to add some of it back in certain places. This technique is entirely different. What I'm going to show you today is how to apply a new color to a naked image, so to speak. It's a really simple technique that's fun to use, and it's great for creating visual interest and drawing attention to a certain portion of a photo.
Admittedly, the process of colorizing a grayscale photo certainly seems straightforward enough, in that it probably involves grabbing a paint brush and painting color onto the image itself. The problem, though, is that while you would succeed in adding color to the photo, you would systematically destroy any detail it once contained.
Using the cutest photo *ever* (snatched from iStockphoto.com), I'm going to show you the trick to adding color while retaining all the glorious detail of the photo.
The very first thing we want to do is make sure the document is in color mode, and not grayscale. Otherwise, we won't get very far, and your frustration level with all things digital could reach an all-time high.
Step 1: Choose Image > Mode and make sure the document is set to either RGB or CMYK. If the document mode is Grayscale, you won't be allowed to paint in color, which can be quite maddening.
NOTE: If this image will be printed
professionally, then you want to
choose CMYK. If you‘re going to
print the image on your home color
inkjet or if the image is destined to
live out its life only on screen, then
go with RGB.
Step 2: Create a new layer
by clicking the New Layer
button at the bottom of the
Layers Palette. This is where
the new paint will live, so
that we don‘t screw up the
original photo.
Step 3: Change the blending mode of the new layer to
either Color or Overlay, as shown below. This will
allow the detail of the image to show through the
paint, instead of the paint being a solid coat.
Step 4: Press B to select
the Brush tool, and click
on the foreground color
chip at the bottom of the
main Toolbar. Pick a
nice pastel color from
the resulting color picker
and press OK.
TIP: Press Command + (PC: Ctrl +) to zoom in, and Command - (PC: Ctrl -) to zoom back out of your document. Another handy tip to remember while doing detail work is that while zoomed in on your document, pressing the spacebar turns the cursor into a little hand which you can then use to pan to a different area of the image.
Step 5: Since we‘re about to
embark upon a bit of detail
work, I‘m going to share a
workspace trick with you
before we start painting.
Choose Window > New
Window for [insert image
name]. This is going to allow
us to be zoomed in really far on
the image in one window, and
still see what the image looks
like at its normal size in
another.
The neat bit is what you do in
one window happens
simultaneously in the other.
Step 6: As you move around in the
image, you come upon places where
your brush is too large, such as the
little strap around her neck. Here
you can press the left bracket key, [,
to cycle down in brush size, and
later cycle back up by pressing the
right bracket key, ].
If you mess up during the painting
process, just press E to select the
Eraser tool and fix your mistake.
Press B to pick the brush back up
and soldier on. After painting her
dress, gloves, purse and hat, here‘s
the little cutie all clad in purple:
Step 7: Create an adjustment layer by pressing the half
black/half white circle at the bottom of the Layers Palette, and
choose Hue/Saturation.
Step 8: In the resulting
dialog box, grab the Hue
slider and move it rightward.
Step 9: In our case, I
decided on a peachy
color (it matches my
web site) and to make
the effect a bit more
subtle, I decreased the
Saturation just a tad, as
shown below.
Grayscale and Still-Video Images

In photography and computing, a grayscale or greyscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information. Images of this sort, also known as black-and-white, are composed exclusively of shades of gray, varying from black at the weakest intensity to white at the strongest.
Grayscale images are distinct from one-bit bi-tonal black-and-
white images, which in the context of computer imaging are
images with only the two colors, black, and white (also called
bilevel or binary images). Grayscale images have many shades
of gray in between.
Grayscale is a range of shades of gray without apparent
color. The darkest possible shade is black, which is the
total absence of transmitted or reflected light.
The lightest possible shade is white, the total transmission
or reflection of light at all visible wavelengths.
Intermediate shades of gray are represented by equal
brightness levels of the three primary colors (red, green
and blue) for transmitted light, or equal amounts of the
three primary pigments (cyan, magenta and yellow) for
reflected light.
Grayscale images are often the result of measuring the intensity
of light at each pixel in a single band of the electromagnetic
spectrum (e.g. infrared, visible light, ultraviolet, etc.), and in
such cases they are properly monochromatic, when only a single frequency band is captured.
But they can also be synthesized from a full-color image by converting it to grayscale.
Still-Video Images

You can extract any video frame as a still image to place wherever you want it to appear in your iMovie project. To easily add a still frame to the same project where it already appears as part of a video clip, extract it from the project clip. If you want to add a still frame from a video clip that isn't part of your project, you can extract it from the source video in any one of your Events.
To extract a still frame from video:
A) Let the pointer hover over the video frame that you want to extract as a still image.
B) Hold down the Control key and press the mouse button to open the menu, and then choose "Add still frame to project."
C) The image is added to the end of your open project as a four-second clip. (If you created the still frame from a source video clip, the Ken Burns (motion) effect is automatically applied; if you've created the still frame from a project clip, the Ken Burns effect isn't applied.)
D) To change the duration, click the Duration button (it looks like a clock) that appears in the clip's lower-left corner when the pointer moves over the clip; enter a time for the Duration, and then click OK.
You can easily export a still image from your video file using GoPro Studio. Here is a procedure that details the process:
Step 1: Go to the Step 2: Edit screen.
Step 2: Select the desired video clip so that it is displayed in the Player window.
Step 3: Mac instructions - Place the playhead so that the frame you want to export is displayed in the Player window, then select Share > Export Still.
Step 4: Windows instructions - Place the playhead so that the frame you want to export is displayed in the Player window, then select File > Export > Still Image.
Step 5: Then just select the image Name, Location, and Size to Export (small, medium, large, & native) and click Export.
Step 6: Check the location you specified to verify that the image was created.
Audio compression

Audio compression (data) is a type of lossy or lossless compression in which the amount of data in a recorded waveform is reduced to differing extents for transmission, respectively with or without some loss of quality; it is used in CD and MP3 encoding, Internet radio, and the like.
Dynamic range compression, also called audio level compression, is different: it reduces the dynamic range (the difference between loud and quiet) of an audio waveform.
Video and audio files are very large beasts. Unless we develop and maintain very high bandwidth networks (gigabytes per second or more) we have to compress the data. Relying on higher bandwidths is not a good option (the M25 Syndrome: traffic needs ever increase and will adapt to swamp the current limit, whatever it is).
As we will see, compression becomes part of the representation or coding scheme in the audio, image, and video formats that have become popular.
We will first study basic compression algorithms and then go on to study some actual coding formats. We have studied the theory of encoding; now let us see how this is applied in practice.
We need to compress video (and audio) in practice since:
1. Uncompressed video (and audio) data are huge. In HDTV,
the bit rate easily exceeds 1 Gbps. -- big problems for storage
and network communications. For example: One of the formats
defined for HDTV broadcasting within the United States is
1920 pixels horizontally by 1080 lines vertically, at 30 frames
per second.
If these numbers are all multiplied together, along with 8 bits
for each of the three primary colors, the total data rate required
would be approximately 1.5 Gb/sec. Because of the 6 MHz channel bandwidth allocated, each channel will only support a data rate of 19.2 Mb/sec, which is further reduced to 18 Mb/sec by the fact that the channel must also support audio, transport, and ancillary data information.
As can be seen, this restriction in data rate means that the
original signal must be compressed by a figure of
approximately 83:1.
This number seems all the more impressive when it is realized
that the intent is to deliver very high quality video to the end
user, with as few visible artifacts as possible.
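The arithmetic behind these figures is easy to verify; the following small Python calculation reproduces the approximate 1.5 Gb/sec raw rate and the roughly 83:1 compression requirement quoted above.

```python
# A quick check of the numbers above for one HDTV format
pixels_per_frame = 1920 * 1080
frames_per_second = 30
bits_per_pixel = 3 * 8                  # 8 bits for each of the three primary colors

raw_rate = pixels_per_frame * frames_per_second * bits_per_pixel
print(raw_rate / 1e9)                   # about 1.49 Gb/s uncompressed

channel_rate = 18e6                     # bits/s left for video in the 6 MHz channel
print(raw_rate / channel_rate)          # required compression ratio, roughly 83:1
```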
2. Lossy methods have to be employed since the compression ratio of lossless methods (e.g., Huffman, arithmetic, LZW) is not high enough for image and video compression, especially when the distribution of pixel values is relatively flat. The following compression types are commonly used in video compression:
Spatial Redundancy Removal - Intraframe coding (JPEG)
Spatial and Temporal Redundancy Removal - Intraframe and
Interframe coding (H.261, MPEG)
These are discussed in the following sections.
Audio compression has become well entrenched in consumer
and professional digital audio products such as the compact
disc (CD), digital versatile disc (DVD), digital audio
broadcasting (DAB) and motion picture experts group (MPEG)
audio layer 3 (MP3) distribution on the Internet.
Audio and speech compression schemes can be conveniently
partitioned into applications reflecting some measure of
acceptable quality, ranging from telephone speech to wideband
audio.
MPEG Layers I, II and III
The International Standards Organisation (ISO) and Motion Picture Experts Group (MPEG) audio coding standard describes audio compression for synchronized audio to accompany the compressed video known as MPEG.
It combines features of MUSICAM (Masking pattern adapted Universal Subband Integrated Coding and Multiplexing) and ASPEC (Adaptive Spectral Perceptual Entropy Coding).
It consists of three layers (coders) of increasing complexity and improving subjective performance, and it operates with input sampling rates of, for example, 32, 44.1 and 48 kHz; it outputs bit rates per monophonic channel between 32 and 192 kbit/sec, or per stereophonic channel between 64 and 384 kbit/sec.
The standard supports single channel mode, stereo mode,
dual channel mode (for bilingual audio programs) and an
optional joint stereo mode.
The encoder operates in conjunction with a real-time model
of the human spectral perception threshold.
This threshold is a frequency-dependent boundary or
threshold that marks sound pressure levels (SPL) below
which the human ear cannot detect sounds.
Signal spectral components below the threshold level that
cannot be heard are declared irrelevant, and they are not
encoded in the compression process.
The encoder operation is quite complex!
The Advanced Audio Coding in MPEG-4 Part 3 was
enhanced relative to the previous standard MPEG-2 Part 7,
in order to provide better sound quality for a given
encoding bitrate.
AAC's best known use is as the default audio format of
Apple's iPhone, iPod, iTunes.
AAC's multiple codecs are:
- Low Complexity Advanced Audio Coding (LC-AAC)
- High-Efficiency Advanced Audio Coding (HE-AAC)
- Scalable Sample Rate Advanced Audio Coding (AAC-
SSR)
- Bit Sliced Arithmetic Coding (BSAC)
- Long Term Predictor (LTP)
AAC has been standardized by ISO and IEC, as part of the
MPEG-2 and MPEG-4 specifications. The MPEG-2
standard contains several audio coding methods, including
the MP3 coding scheme. AAC is able to include 48 full-
bandwidth (up to 96 kHz) audio channels in one stream
plus 16 low frequency effects (LFE, limited to 120 Hz)
channels, up to 16 "coupling" or dialog channels, and up to
16 data streams.
The quality for stereo is satisfactory for modest requirements at 96 kbit/s in joint stereo mode; however, hi-fi transparency demands data rates of at least 128 kbit/s. The MPEG-2 audio tests showed that AAC meets the requirements referred to by the ITU as "transparent" at 128 kbit/s for stereo, and 320 kbit/s for 5.1 audio.
Simple Audio Compression Methods

Traditional lossless compression methods (Huffman, LZW, etc.) usually don't work well on audio compression (for the same reason as in image compression). The following are some of the lossy methods applied to audio compression:
Silence compression - detect the "silence", similar to run-length coding.
Adaptive Differential Pulse Code Modulation (ADPCM), e.g., in CCITT G.721 - 16 or 32 kbits/sec. It (a) encodes the difference between two consecutive signals, and (b) adapts the quantization so that fewer bits are used when the value is smaller. It is necessary to predict where the waveform is headed, which is difficult. (A toy differential coder is sketched after this list.)
Apple has a proprietary scheme called ACE/MACE, a lossy scheme that tries to predict where the wave will go in the next sample. It gives about 2:1 compression.
Linear Predictive Coding (LPC) fits the signal to a speech model and then transmits the parameters of the model. It sounds like a computer talking, at 2.4 kbits/sec.
Code Excited Linear Predictor (CELP) does LPC, but also transmits the error term - audio conferencing quality at 4.8 kbits/sec.
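To illustrate the differential idea behind ADPCM mentioned in the list above, here is a toy DPCM-style coder in Python. It is a sketch only, with a fixed quantizer step size; it is not the CCITT G.721 algorithm, which adapts the step size and uses a more elaborate predictor.

```python
STEP = 4  # fixed quantizer step size; real ADPCM adapts this per sample

def dpcm_encode(samples):
    """Transmit coarse codes for the difference between each sample and a prediction."""
    codes, predicted = [], 0
    for s in samples:
        diff = s - predicted
        code = int(round(diff / STEP))   # the small code needs fewer bits than the raw sample
        codes.append(code)
        predicted += code * STEP         # track the decoder's reconstruction, not the true sample
    return codes

def dpcm_decode(codes):
    """Rebuild the signal by accumulating the quantized differences."""
    samples, predicted = [], 0
    for code in codes:
        predicted += code * STEP
        samples.append(predicted)
    return samples

signal = [0, 3, 9, 14, 16, 15, 11, 6]
print(dpcm_decode(dpcm_encode(signal)))  # close to, but not exactly, the input (lossy)
```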
JPEG Standard

In computing, JPEG (seen most often with the .jpg or .jpeg filename extension) is a commonly used method of lossy compression for digital images, particularly for those images produced by digital photography.
The degree of compression can be adjusted, allowing a
selectable tradeoff between storage size and image quality. JPEG
typically achieves 10:1 compression with little perceptible loss
in image quality.
JPEG compression is used in a number of image file formats.
JPEG/Exif is the most common image format used by digital
cameras and other photographic image capture devices; along
with JPEG/JFIF, it is the most common format for storing and
transmitting photographic images on the World Wide Web.
These format variations are often not distinguished, and are
simply called JPEG.
The term "JPEG" is an acronym for the Joint Photographic
Experts Group, which created the standard. The MIME media
type for JPEG is image/jpeg (defined in RFC 1341), except in older Internet Explorer versions, which provide a MIME type of image/pjpeg when uploading JPEG images.
JPEG/JFIF supports a maximum image size of 65535×65535
pixels – one to four gigapixels (1000 megapixels), depending
on aspect ratio (from panoramic 3:1 to square).
"JPEG" stands for Joint Photographic Experts Group, the name
of the committee that created the JPEG standard and also other
still picture coding standards. The "Joint" stood for ISO TC97
WG8 and CCITT SGVIII. In 1987 ISO TC 97 became ISO/IEC
JTC1 and in 1992 CCITT became ITU-T. Currently on the
JTC1 side JPEG is one of two sub-groups of ISO/IEC Joint
Technical Committee 1, Subcommittee 29, Working Group 1
(ISO/IEC JTC 1/SC 29/WG 1) - titled Coding of still pictures. On the ITU-T side, ITU-T SG16 is the respective body.
The original JPEG group was organized in 1986, issuing the
first JPEG standard in 1992, which was approved in September
1992 as ITU-T Recommendation T.81 and in 1994 as
ISO/IEC 10918-1.
The JPEG standard specifies the codec, which defines how an
image is compressed into a stream of bytes and decompressed
back into an image, but not the file format used to contain that
stream.
The Exif and JFIF standards define the commonly used file
formats for interchange of JPEG-compressed images.
JPEG standards are formally named as Information
technology – Digital compression and coding of continuous-
tone still images.
The JPEG standard includes:
Objectives
Architecture
DCT Encoding and Quantization
Statistical Coding
Predictive Lossless Coding, and
Performance
JPEG Objectives
JPEG undertook to develop a single standard applicable to the still-imaging needs of a wide range of applications in all the different industries that might use digital continuous-tone imaging. The scope of this is best seen by listing the objectives in detail:
1. To be at or near the state of the art for degree of compression
versus image quality,
2. To be parameterizable so that the user can select the desired
compression versus quality tradeoff,
3.To be applicable to practically any kind of source image,
without regard to dimensions, image content, aspect ratio, etc.,
4. To have computational requirements that are reasonable for both hardware and software implementations, and
5. To support four different modes of operation:
(a) sequential encoding, where each image component is encoded in the same order that it was scanned;
(b) progressive encoding, where the image is encoded in multiple passes so that a coarse image is presented rapidly, followed by repeated images showing greater and greater detail;
(c) lossless encoding, where the encoding guarantees exact reproduction of all the data in the source image; and
(d) hierarchical encoding, where the image is encoded at multiple resolutions.
These objectives were extremely ambitious, yet they are largely met by the completed standard, which is testimony to the excellent work of the JPEG committee.
JPEG-Architectures

The lossy modes of operation (a, b, d) are implemented with DCT encoding of 8 x 8 pixel blocks, followed by one of two statistical coding methods, while the lossless option (c) is implemented with simple predictive coding followed by statistical coding. This is shown in the following figures:
Fig 1(a): Sequential coding block diagram - the source image data passes through a DCT encoder, a quantizer (with a quantizing table supplied by a table specification), zig-zag ordering, and statistical coding to produce the compressed image bit stream.
Figures 1(b), 1(c), 1(d): Progressive encoding, lossless coding, and hierarchical coding block diagrams; the hierarchical diagram adds a hierarchical control stage ahead of the DCT encoder and quantizer.
The architectures shown in the previous figures apply to a single grayscale image or to one of the components of a color image.
To compress a color image, the color image components can be
either completely compressed one after another, or the three
components can be interleaved for each block of the image.
In the case of sequential-mode encoding, DCT encoding is done on
the blocks of the image as they are scanned, and the DCT coefficient
output is transmitted block by block in the same order.
For progressive-mode encoding, an image buffer is added after the
DCT encoding step. The progressive-mode behavior is obtained by
reading out different portions of the DCT coefficients to achieve
progressively improved quality over several scans.
For hierarchical-mode encoding, processing is added ahead of the
DCT encoder to filter and subsample the source image before
encoding. This sub-sampling and encoding is done repeatedly with
progressively less sub-sampling to transmit images of increasing
resolution one after another
JPEG-DCT Encoding and Quantization

The output of the DCT encoder (the DCT coefficients) is shown in Figure 2(a) as a 2-D array with the DC coefficient in the upper left corner, and the AC coefficients arranged with increasing spatial frequency horizontally and vertically.
These components are quantized according to a 64-entry table, which must be specified to the encoder by the application.
The quantization table has 8 bits per entry and specifies the step
size of quantizing for each DCT coefficient.
This allows each coefficient to be represented with no more
precision than is necessary to achieve the desired image quality.
The standard does not specify any quantization tables; these must
be provided by the application and will become part of the data
stream, so the decoder knows what table was used.
Therefore, modification of the quantization table specified during encoding is one way to vary the degree of compression.
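As a sketch of what the quantization step does, the following Python fragment divides each DCT coefficient by the corresponding step size and rounds. The table values used here are made up for illustration; as noted above, the real table is supplied by the application and carried in the data stream.

```python
import numpy as np

quant_table = np.full((8, 8), 16)   # hypothetical table: one uniform step size...
quant_table[4:, 4:] = 64            # ...with coarser steps for the high-frequency corner

def quantize(dct_coeffs, table):
    # Each coefficient keeps only as much precision as its step size allows
    return np.round(dct_coeffs / table).astype(int)

def dequantize(levels, table):
    # The decoder multiplies back; the rounding error is the (invisible, we hope) loss
    return levels * table
```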
After quantization, the DC coefficient is treated differently from the
AC coefficients. Because there is usually a strong correlation
between DC coefficients of adjacent 8 x 8 blocks, the DC coefficient
is encoded as the difference from the previous block in the encoding
sequence.
Figure 1(a) also shows a step called zig-zag ordering between the
quantizer and the statistical coder. This is an important step, which
arranges the DCT coefficients so that the statistical coding will be
more effective. It is diagrammed in Figure 2(b).
In order to create a bit-stream where coefficients that are more likely
to be nonzero (low frequency ones) are placed before coefficients
that are more likely to be zero (high-frequency ones), the zig-zag
sequence shown by Figure 2(b) is used to read the coefficients into
the bit-stream. The result is that all the zero-value coefficients tend
to be together at the end of the block and can be transmitted with
vary few bits using a simple run-length code.
Fig 2(a): 2-D matrix of DCT coefficients for one 8 x 8 block, with horizontal and vertical spatial frequencies running from 0 to 7. Fig 2(b): Zig-zag ordering sequence, starting at the DC coefficient.
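A small Python sketch of the zig-zag read-out may help; the index ordering below reproduces the familiar path of Figure 2(b), so that the high-frequency (usually zero) coefficients end up in one long run at the end of the block.

```python
import numpy as np

def zigzag_order(block8x8):
    """Read an 8x8 array of quantized coefficients in zig-zag order."""
    indices = sorted(((r, c) for r in range(8) for c in range(8)),
                     key=lambda rc: (rc[0] + rc[1],                     # anti-diagonal number
                                     rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))  # alternate direction
    return [block8x8[r, c] for r, c in indices]

levels = np.arange(64).reshape(8, 8)
print(zigzag_order(levels)[:6])   # [0, 1, 8, 16, 9, 2] - the start of the zig-zag path
```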
JPEG-Statistical Coding
The final encoder processing step is statistical coding. It
achieves lossless compression by encoding the quantized DCT
coefficients more efficiently based on their statistical
characteristics.
The JPEG standard allows two types of statistical coding: Huffman coding or arithmetic coding. The Baseline sequential coder uses only Huffman statistical coding.
Huffman coding (a Huffman code is an optimal prefix code found using the algorithm developed by David A. Huffman) requires specification of the Huffman table or code block; this is the job of the application, and the standard does not specify a table except for the Baseline coder.
The Huffman table also becomes part of the image bit-stream; the standard supports up to four Huffman tables per image, to provide for different tables for each component of a multi-component image.
The arithmetic statistical coding option does not require a separate table to be provided, but it does require a little more processing to implement. However, it results in somewhat better compression (5 to 10 percent) for many images.
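To show what Huffman coding does with such data, here is a compact sketch of Huffman code construction using only the Python standard library. It illustrates the principle; it is not the specific table format defined by the JPEG standard.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a prefix code: frequent symbols receive shorter codewords."""
    freq = Counter(symbols)
    # Each heap entry: (frequency, tie-breaker, {symbol: partial codeword})
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)      # merge the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in left.items()}
        merged.update({s: '1' + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes([0, 0, 0, 0, 0, 1, 1, -1, 2])
print(codes)   # the most frequent symbol (0) gets the shortest codeword; exact bits depend on ties
```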
JPEG-Predictive Lossless Coding

The lossless compression option does not use the DCT. Instead, a simple predictor is used, but there is a choice of seven different kinds of prediction available. The different predictor choices specify how many and which adjacent pixels are used to predict the next pixel.
The statistical coding in the lossless mode can use either of the two methods specified for the DCT modes, and is similar to what is specified for the DC coefficient of the DCT modes.
The lossless compression will work with source images having from 2 to 16 bpp, and typically gives around 2:1 compression for photographic color images.
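A minimal sketch of the predictive idea, using the left-neighbor predictor (one of the simple predictor choices) and assuming NumPy, shows why the scheme is lossless: the prediction errors cluster near zero yet allow exact reconstruction.

```python
import numpy as np

def predict_left(image):
    """Prediction errors using the left-neighbor predictor (first column kept as-is)."""
    errors = image.astype(int).copy()
    errors[:, 1:] -= image[:, :-1]
    return errors

def reconstruct_left(errors):
    """Invert the predictor by accumulating errors along each row."""
    return np.cumsum(errors, axis=1)

img = np.array([[10, 12, 12, 13],
                [40, 41, 43, 43]])
errs = predict_left(img)                              # mostly small values, cheap to entropy-code
assert np.array_equal(reconstruct_left(errs), img)    # exact reconstruction: lossless
```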
JPEG-Performance

Compression performance is best specified by relating image quality to bits per pixel in the compressed data stream. This relationship depends to some degree on the characteristics of the source image: some images are harder to compress successfully than others. With this in mind, here are some figures for 'typical' source images [2]:
0.25-0.5 bpp: moderate to good quality, sufficient for some applications;
0.5-0.75 bpp: good to very good quality, sufficient for many applications;
0.75-1.5 bpp: excellent quality, sufficient for most applications;
1.5-2.0 bpp: indistinguishable from the original, sufficient for the most demanding applications.
MPEG Standard

Digital motion video can be accomplished with the JPEG still-image standard if you have fast enough hardware to process 30 images per second.
However, the maximum compression potential cannot be
achieved because the redundancy between frames is not being
exploited.
Furthermore, there are many other things to be considered in
compressing and decompressing motion video, as indicated in
the objectives.
The MPEG standard discussion includes:
1. Objectives
2. Architecture
3. Bitstream
4. Performance
5. MPEG-2 and MPEG-4
MPEG Objectives

As with the JPEG standard, the MPEG standard is intended to be generic, meaning that it will support the needs of many applications.
As such, it can be considered as a motion video compression
toolkit, from which a user selects the particular features that
best suit his or her application. More specific objectives are:
1. The standard will deliver acceptable video quality at
compressed data rates between 1.0 and 1.5 Mbps.
2. It will support either symmetric or asymmetric
compress/decompress applications.
3. When compression takes it into account, random-access playback is possible to any specified degree.
4. Similarly, when compression takes it into account, fast-forward,
fast-reverse, or normal-reverse playback modes can be made available
in addition to normal (forward) playback.
5. Audio/video synchronization will be maintained.
6. Catastrophic behavior in the presence of data errors should be
avoidable.
7. When it is required, compression-decompression delay can be
controlled.
8. Editability should be available when required by the application.
9. There should be sufficient format flexibility to support playing of
video in windows.
10. The processing requirements should not preclude the development of low-cost implementations.
Some of these objectives are conflicting, and they all conflict
with the objectives of cost and quality.
In spite of that, the proposed standard provides for all of the
objectives, but of course not all at once.
A proposed application has to make its own choices about
which features of the standard it requires and accept any
tradeoff that this may cause.
Architecture

The MPEG standard is primarily a bitstream specification, although it also specifies a typical decoding process to assist in interpreting the bitstream specification.
This approach supports data interchange, but it does
not restrict creativity and innovation in the means for
creating or decoding that bitstream.
The bitstream architecture is based on a sequence of pictures, each of which contains the data needed to create a single displayable image. Note that the order of transmission of pictures in the data stream may not be the same as the order in which pictures will be displayed; this will be evident shortly.
There are four different kinds of pictures, depending on how each picture is to be decoded.
I pictures are intracoded, meaning that they are coded independently of any other picture. An I picture must exist at the start of any video stream and also at any random-access entry point in the stream.
P pictures are predicted pictures, which are coded using motion compensation from a previous I or P picture.
B pictures are interpolated pictures, which are coded by interpolating between a previous and a future I or P picture. This process is sometimes referred to as bidirectional prediction.
D pictures are a special format that is only used for
implementing fast search modes.
An I picture requires the most data; it is similar to a JPEG image. It is structured into 8 x 8 blocks that are DCT coded, quantized, and statistically encoded.
A P picture requires about one-third of the data of an I picture; it consists of 16 x 16 macroblocks, which are DCT-coded motion correction values.
A B picture takes 2:1 to 5:1 less data than a P picture; it also has macroblocks and blocks containing interpolation parameters and DCT-coded correction values.
The most compression is obtained by using as many B pictures as possible. However, to perform B decoding, the 'future' I or P picture involved must be transmitted before any of the dependent B pictures can be decoded; the resulting delay is proportional to the number of B pictures in series.
Because of the delay issue, the consideration of random access,
and the effectiveness of the interpolation technique, a typical
displayed picture sequence would be of the form (the numbers
are the order of display):
I B B B P B B B I B B B P……
1 2 3 4 5 6 7 8 9 10 11 12 13
This picture sequence is diagrammed below, showing the dependencies of pictures on each other.
Figure: A typical MPEG picture sequence showing inter-frame dependences (forward prediction from I and P pictures, bidirectional interpolation for B pictures, and possible random-access entry points at the I pictures).
The standard, however, is completely flexible with regard to the picture sequence, and an application can (if it wishes) tailor it to optimize any situation.
As mentioned before, when B pictures are used, the reference I and P pictures must be transmitted before any dependent B pictures, so the order of transmission for the sequence above is (the numbers are still the order of display):
I P B B B I B B B P B B B…
1 5 2 3 4 9 6 7 8 13 10 11 12
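The reordering rule is simple enough to express in a few lines; the sketch below (an illustration only, not part of the standard) holds B pictures back until the later reference picture they depend on has been sent, and reproduces the transmission order shown above.

```python
def transmission_order(display_sequence):
    """Reorder a display-order GOP pattern into transmission order."""
    out, pending_b = [], []
    for index, picture in enumerate(display_sequence, start=1):
        if picture == 'B':
            pending_b.append((picture, index))   # hold B pictures back
        else:
            out.append((picture, index))         # send the I/P reference they depend on first
            out.extend(pending_b)                # then release the waiting B pictures
            pending_b = []
    return out

display = list("IBBBPBBBIBBBP")
print(transmission_order(display))
# [('I', 1), ('P', 5), ('B', 2), ('B', 3), ('B', 4), ('I', 9), ('B', 6), ...]
```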
Bitstream Syntax

The bitstream is structured at six levels of hierarchy in order to support all of the features of MPEG video. These are:
Sequence layer - an independent video stream.
Group of pictures layer - a clip of video that begins with a random-access entry point and has uniform video parameters within it.
Picture layer - represents one displayable image.
Slice layer - a variable-size sub-picture group that provides for synchronizing the decoder in the event of an error.
Macroblock layer - the 16 x 16 pixel motion compensation unit.
Block layer - the 8 x 8 pixel unit on which the DCT is performed.
The syntax levels are diagrammed below.
Figure: MPEG layered bitstream structure (a video sequence contains groups of pictures, which contain pictures, slices, and macroblocks; each macroblock contains 8 x 8 pixel blocks, including the CR and CB chrominance blocks; possible random-access entry points exist at the sequence and group-of-pictures levels).
Performance

MPEG provides for a wide range of video resolutions and data rates. One set of choices that has been widely researched is optimized for a data rate of about 1.2 Mbps (the CD-ROM data rate). For 30 frames/second video at a display resolution of 352 x 240 pixels, the quality of compressed and decompressed video at this data rate is often described as similar to VHS recording.
Most scenes do not exhibit compression artifacts, but the most demanding material may require resolution or frame rate tradeoffs to obtain visually acceptable results.
MPEG-2 and MPEG-4
The ISO committee which developed the MPEG standard is currently at work specifying a successor standard known as MPEG-2. The video component is targeted for bit rates in the range of about 2 to 15 Mbps, which is sufficient for supporting HDTV. Additionally, MPEG-2 includes a number of new features with the intent of providing compatibility with existing standards such as terrestrial video, MPEG and H.261.
Compatible transmission is conceptually similar to today's TV broadcasts, which can be received by both color and black-and-white television sets.
The MPEG-2 encoding is intended to allow a single
transmission to be received by a range of digital televisions,
from small portable units that might only support NTSC
resolution to HDTV receivers.
Scalable digital video is critical to transmission over packet
switching networks.
As the load on the network increases, the transmitting node
adjusts by decreasing the quality of the transmitted video.
The audio encoding for MPEG-2 is also being extended.
MPEG-2 audio will encode up to five full bandwidth channels
(left, right, center, and two surround channels), an additional
low-frequency enhancement channel, and up to seven
commentary or multilingual channels.
Several improvements on the MPEG audio format are planned
for lower sample rates.
More recently, another digital video encoding standard effort
known as MPEG-4 is underway.
This new initiative is for very low bit-rate coding of
audiovisual programs, with particular application to mobile
multimedia communications.
Although MPEG-2 and MPEG both use a DCT algorithm, it is
anticipated that MPEG-4 will be based on a new algorithm,
which , though computationally more expensive, results in
significantly higher compression.
DVI TECHNOLOGY

DVI technology is different from the other standards discussed here because it is specifically based on the use of special hardware.
Intel Corporation and IBM Corporation have developed a programmable chipset which implements the technology in a co-processor environment on any type of computer platform.
These chips support a wide range of multimedia functions in
software, including JPEG compression and several DVI-
unique compression algorithms of stills or motion video.
Because of the programmability of the chipset, it can respond
to new algorithm developments-for example, the programming
of the chips to do MPEG processing is being explored.
Intel is committed to producing higher-speed versions of the DVI chipset, and has even discussed the possible integration of DVI functionality with a future generation of the x86 CPU family.
Thus, the DVI hardware is an important engine for present and
future compression developments.
DVI TECHNOLOGY MOTION VIDEO COMPRESSION

DVI technology can do both symmetric and asymmetric motion video compression/decompression.
The asymmetric approach is called Production-Level Video (PLV). Video for PLV must be sent to a central compression facility, which uses large computers and special interface equipment, but any DVI system is capable of playing back the resulting compressed video.
The picture quality of PLV is the highest that can currently be
achieved.
The other DVI compression approach is called Real Time Video (RTV). It is done on any DVI system that has the ActionMedia Capture Board installed. Playback of RTV is on the same system or any other DVI system.
Because RTV is a symmetric approach, which requires that
compression be done with only the computing power available
in a DVI system, RTV picture quality is not as good as PLV
picture quality.
DVI PRODUCTION-LEVEL COMPRESSION

PLV is an interframe compression technique; the algorithm details are proprietary, and we can say only that it is block-oriented and that it involves multiple compression techniques. Since it was designed specifically for the DVI chipset, it is optimized for that environment, and it probably would not make much sense to run it on different hardware.
PLV compression is an asymmetric approach where a large computer does the compression and the DVI hardware in the PC does the decompression.
It takes a facility costing several hundred thousand dollars to
perform PLV compression at reasonable speeds.
Since this cost is too much for a single application developer to
bear, centralized facilities are provided where developers can
send their video to be compressed for a fee.
High quality motion video compression has difficulties right
from the start.
The data rates created by the initial digitizing are high, even for
large computers.
This happens because the initial digitizing really has to be
done in real time to obtain the best quality.
In most cases, the input video medium for compression will be an analog videotape; for best results, it will be one-inch broadcast-quality tape.
Although one-inch tape machines can play at slow speeds,
they do it by introducing frame storage and processing, which
would interfere with the quality of the compressed result. The
only way to get around that processing is to run the VTR at
normal play speed ( 30 frames per second) .
Therefore, for best quality, we must invest in digitizing and
interface hardware, which will let the VTR run at normal speed
and capture the digital data on computer disk.
For PLV compression, the real-time video from the VTR is
digitized, filtered, and chrominance subsampled by special
hardware before storage on digital disk.
Such storage still requires a data rate of about 2 megabytes per second (substantially higher than the storage data rate of a typical PC), and a 1.2-gigabyte digital disk only holds 10 minutes of this partially compressed digital video.
Then, in nonreal time, the data is taken frame by frame from
the digital disk and run through the PLV compression
algorithm.
The compression algorithm typically runs on a parallel-processor CPU.
Compression takes about 3 seconds per frame on a 250-MIPS machine, still about 90 times slower than real time. (A minute of final compressed video will take 90 minutes to compress.)
Of course, compression speed is proportionally faster on an even
larger parallel machine.
PLV Performance

The DVI PLV compression algorithm is proprietary and will not be described here. Its performance is also difficult to describe or show here, because it does not make sense to show still frames from a motion sequence. That is because an individual frame from motion video may contain artifacts which are not visible when those frames are delivered to a viewer at 30 frames per second.
There is a significant degree of visual averaging taking place
when viewing 30 frames per second video.
This is also true for normal analog television- noise
artifacts become highly visible in a single still frame,
whereas the averaging between frames in a motion
sequence makes noise much less visible.
You can observe this problem if you experiment with a VCR which has slow-motion playback. In any case, PLV compressed video delivers full-screen, full-motion pictures at a quality subjectively competitive with half-inch VCR pictures.
The PLV compression algorithm must be given goals for the data rate of the compressed bit stream and for the amount of DVI processor chip time per frame which will be devoted to decompression.
Even when working with CD-ROM, we will often want to use fewer than 5120 bytes per frame for the average data rate of the video because we want to leave space in the CD-ROM data rate for audio or possibly other data. We also may wish to display the motion video at less than full-screen size in order to save data so that more than 72 minutes can fit on one CD-ROM disc.
Another way to effectively reduce the compressed data rate is
to lower the video frame rate.
In some cases, 15 frames per second is fast enough. This can be used either to cut the video data rate in half or to allow more than 5120 bytes per frame to achieve somewhat higher video quality, as the arithmetic below shows.
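The numbers involved are easy to check; a single-speed CD-ROM delivers about 153,600 bytes per second, which works out as follows.

```python
# A quick look at the CD-ROM data budget behind the figures above
cd_rom_rate = 153_600          # bytes per second available from CD-ROM

print(cd_rom_rate // 30)       # 5120 bytes per frame at 30 frames/second
print(cd_rom_rate // 15)       # 10240 bytes per frame at 15 frames/second
print(cd_rom_rate * 60)        # roughly 9.2 MB per minute of compressed video and audio
```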
In the case of DVI processor decompression time, there are 33 milliseconds per frame available at 30 frames per second (40 milliseconds at 25 frames per second). Not all of this time can be used for decompression, because we also need the processor for tasks such as moving the image or scaling the image to a different size.
You can see that there is a multidimensional tradeoff here
involving four interacting parameters, which together will
determine the resulting picture quality:
Image Cropping (pixel count)
Compressed data bytes per frame
Video frame rate
Decompression processing time
The PLV compression software takes all of these parameters as input, and it will try to produce the best-quality pictures within these constraints.
Because PLV is an interframe compression scheme, there are some special considerations involved in starting up playback of a scene or in starting a scene in the middle. The first frame of a motion sequence must be treated as a still image (called a reference frame); additional time is required to send all the data for a reference frame, about three times the data of an average motion video frame. (If we are using motion video at 5120 bytes average per frame, a reference frame will require around 15,000 bytes of data.)
If it is intended that a scene will be started by the application at points other than the beginning, reference frames must also be placed at those points. This can usually be done without causing any noticeable interruption of motion when the scene is played from end to end, because the DVI decompression software uses multiple frame buffers in VRAM so that variations in the input compressed data rate are accommodated without affecting the displayed frame rate.
In any case, if you have special needs for reference frames in your video, they have to be expressed at the time you order PLV compression.
DVI REAL-TIME COMPRESSION

The use of a centralized compression service that is remote from the developer of an application introduces delay and expense into the application development process. It also precludes any application that needs to do real-time compression.
In creating an application, a developer needs a way to
experiment with the video and audio in the context of the
application without incurring this delay and expense.
This need is filled by DVI's Real-Time Video (RTV) service, which is a compression process that is done in real time on a DVI development system. With RTV, the developer may compress his or her video and audio to the same file size as from the PLV service, and then use those files in the application under development in exactly the same way that the final PLV files will be used.
By this means the developer may experiment as much as
needed and actually try out the complete application before
sending any video out for PLV compression.
The tradeoff in RTV is picture quality.
To accomplish motion video compression with only the resources of a DVI development system means that RTV is lower in resolution and frame rate than PLV. Compression is done with the DVI processor chips and, while these chips are very powerful among their peers, in the milliseconds available to compress a frame in real time the processor does not compare with the computing cycles available in the PLV facility, so the algorithm must be simplified.
However, the results produced are good enough to fill most
needs for application development and testing.
For some applications which do require real-time compression
within the application, RTV may completely fill the bill.
RTV compression allows the user to make some trades of compression versus picture quality if the RTV-compressed code will never have to be stored on CD-ROM. By allowing the data rate to go higher than 153,600 bytes per second, using fast hard disk storage, the RTV frame rate can be increased to 30 frames per second.
Most DVI capture software provides for user choice of these
parameters.
Communication between RTV and PLV occurs through the medium of SMPTE time code. When the original one-inch videotape which will eventually go to PLV compression is captured by RTV for development purposes, the time code is captured for storage with the video frames.
RTV is not frame-to-frame compression, so an RTV file can be started or stopped at any point.
In the RTV mode, decisions about in and out cut points for the displayed video can be made, and the time code values may be read from the RTV file data to create the edit list which will be used for the PLV compression.
After all decisions about video material have been made, the master one-inch tapes and the edit list go to the PLV compression facility for final compression of exactly and only the selected scenes.
MIDI PROTOCOL

In the early 1980's, several music instrument manufacturers agreed on a networking standard, the Musical Instrument Digital Interface (MIDI). The standard is now maintained by the MIDI Manufacturers Association (MMA) and disseminated by the International MIDI Association (IMA) [67].
The specification is also reproduced in whole or in part in
references such as [65,68,69].
The specification calls for certain hardware connections, using a 5-pin DIN connector. There are three kinds of connections allowed: in, out and 'thru'. A thru connector provides a direct copy of the input signal.
It should be mentioned in passing that although the MIDI network has been made to work, it is to be expected that some superset of MIDI will appear on the market. Already, companies such as Lone Wolf have attempted to bring to market an optical network which includes MIDI as a subset.
The MIDI software specification involves 8 data bits, a start bit, and a stop bit, for a total of 10 bits transmitted at a rate of 31.25 kbaud. A message consists of one status byte followed by zero or more data bytes.
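As an illustration of this message structure, the following sketch builds the three bytes of a note-on message. The status nibble 0x9 for note-on and the 0-127 range for data bytes come from the MIDI specification; the helper function itself is just an example.

```python
NOTE_ON = 0x90   # status nibble for note-on; the low nibble carries the channel number

def note_on(channel, key, velocity):
    """Return the three bytes of a MIDI note-on message."""
    assert 0 <= channel < 16 and 0 <= key < 128 and 0 <= velocity < 128
    return bytes([NOTE_ON | channel, key, velocity])

print(note_on(0, 60, 100).hex())   # '903c64' - middle C on channel 0, velocity 100
# A note-on with velocity 0 doubles as a note-off, as noted later in the text.
```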
MIDI devices, such as tone generators, can be connected in networks, such as chains or trees. Each device can listen to one or more MIDI channels. All data and mode messages are sent to all receivers, but the messages include a channel number so that only the devices set to that channel act on those messages. Messages are defined for musical events, such as note on, note off, and pitch bend change.
The key number represents keys from the bottom of the keyboard range to the top. Velocity means the speed with which the key is struck, and generally controls the attack characteristics, overall amplitude, and spectrum of the note.
The polyphonic key pressure message is sent by devices such as keyboards that can measure the pressure applied as each key is held. The pressure for each key can be sent separately so that individual notes can be modified in performance. A channel pressure message comes from a device that can measure the pressure from its sensors but can send only one pressure value (usually the maximum).
A program change message causes the synthesizer to select
one of 128 voices .
In the early years of MIDI, each manufacturer assigned arbitrary voices to those program numbers. The recent General MIDI specification includes a 128-voice Instrument Patch Map. A melody recorded using one General MIDI synthesizer's xylophone sound, for example, will also be played back using a xylophone sound on any other General MIDI synthesizer.
Four mode messages (not shown in the table) determine, among other things, whether the instrument's voices will be assigned to incoming notes in monophonic (single melody) or polyphonic (several voices at once) fashion.
There is also provision for common messages (sent to all receivers), real-time messages (for synchronization), and system exclusive (sysex) messages. System exclusive is essentially a generalized escape mechanism for messages of arbitrary length.
MIDI is not limited to hardware systems. Indeed, the acceptance of MIDI made possible the proliferation of software programs running on the Amiga, Macintosh, Atari, and PC.
MIDI software includes sequencer programs, with which the
musician can record, play back, view , and alter musical events,
working with music notation, piano-roll notation, text displays
of MIDI commands, and the like.
The figure shows the basic MIDI messages for playing back a melody on a synthesizer. In the figure, all the messages are sent out over channel 0. The note numbers and velocities are given in decimal representation. A note-on message with a velocity of 0 is the same as a note-off message. Time in the first column is in milliseconds, at a tempo of 90 quarter notes per minute. Note that the first note occurs after a 3-second delay from the start of the playback.
The original MIDI specification dealt primarily with real-time
music performance.
To represent time in music, there are basically two possibilities
–absolute time and delta time.
With delta time, the time elapsed since the previous
event is recorded.
With absolute time, time elapsed since the beginning
of the composition is represented.
In the most general terms, both kinds of time are identical.
But in practical implementation , delta time has the advantage that a whole sequence can be moved as one unit; only the start time of the unit must be changed.
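Converting between the two representations is a simple running sum in one direction and a pairwise difference in the other, as this small sketch shows.

```python
def to_absolute(delta_times, start=0):
    """Accumulate delta times into absolute times from the start of the piece."""
    absolute, t = [], start
    for dt in delta_times:
        t += dt
        absolute.append(t)
    return absolute

def to_delta(absolute_times, start=0):
    """Recover the time elapsed since the previous event."""
    deltas, previous = [], start
    for t in absolute_times:
        deltas.append(t - previous)
        previous = t
    return deltas

deltas = [3000, 500, 500, 1000]            # milliseconds between events
print(to_absolute(deltas))                 # [3000, 3500, 4000, 5000]
print(to_delta(to_absolute(deltas)))       # round trip back to the deltas
```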
BRIEF SURVEY OF SPEECH RECOGNITION AND GENERATION

Speech is one of the main channels for human communication and thus must be handled carefully in any multimedia communications system. In contrast to what has been discussed thus far about music, a major criterion in speech is intelligibility.
Telephone-quality speech has a bandwidth limited to around 200-3400 Hz. An 8-kHz sample rate results in a 64 kb/sec bit rate for PCM speech, far smaller than that required for music PCM.
Speech Production :
The organs involved in speech include the larynx, which
encloses loose flaps of muscle called vocal cords.
The puffs of air that are released create a waveform which can
be approximated by a series of rounded pulses.
The waveform created by the vocal cords propagates through a
series of irregularly shaped tubes, including the throat, the
mouth , and the nasal passages.
At the lips and other points in the tract, part of waveform is
transmitted further, and part is reflected.
The flow can be significantly constricted or completely
interrupted by the uvula, the teeth, and the lips.
A voiced sound occurs when the vocal cords produce a more or less regular waveform. The less periodic, unvoiced sounds involve turbulence in which some part of the vocal tract is tightened.
Vowels are voiced sounds produced without any major
obstruction in the vocal cavity.
In speech, formants (introduced earlier) are created by the position of the tongue and jaw; in separating vowels, for example, the first three formants are the most significant. The fundamental of the female voice is around 200 Hz and higher, with the formants perhaps 10 percent higher than those of the male.
Consonants arise when the vocal tract is more or less
obstructed.
Sounds at the level of consonants and vowels are collectively known as phonemes, the most basic units of speech differentiation, analysis, and synthesis below the level of the word.
The figure shows a sonogram, a time-varying representation of a speech signal.
The regions of high energy appear dark.
The vertical stripes in the dark region correspond to
individual pulses from the vocal cords.
The change in the position of the darkest areas from left to
right corresponds to the changes in formants.
The SPASM system developed by Cook [76] combines models of the glottal waveform and noise sources in the vocal tract with modeling of the tubes and obstructions in the vocal tract. The resulting articulatory model is implemented with a GUI, including a cross-section of the head, to permit synthesis of spoken and singing voice.
Encoding and Transmitting Speech

The simplest way to encode speech is to use PCM, discussed above. The 8-bit, 8 kHz standard for speech is of significantly lower quality than what is required for music. Still, at the nominal 64-kb/sec rate for speech, if one bit per sample can be saved, then the total saving is 8 kb/sec. Methods for lowering the bit rate thus remain an active area of research. The ADPCM method discussed above can easily save 2 to 4 bits per sample.
PCM, ADPCM, and related methods attempt to describe the waveform itself. There are other methods, such as the subband coding discussed above under MPEG.
We now turn to another class of methods, called voice coders, or vocoders.
The human vocal tract can be simplified by assuming, for
example , the source of vibration for voiced sounds is not
affected by the rest of the vocal tract.
The series of filters that model the vocal tract can be modeled
such that if one filter changes, there is no effect on the others.
Under such conditions, we can calculate the voice model coefficients independently of the fundamental frequency or the voiced/unvoiced decision. We can also reasonably assume that formants change quite slowly compared to the rate of individual pulses from the vocal cords, and transmit the filter coefficients at a slower rate.
The channel vocoder pioneered by Dudley analyzes speech with a bank of filters. The driving function for synthesis is noise or a series of pulses like those generated by the vocal cords. The filter coefficients, the fundamental frequency, and the voiced/unvoiced decision are transmitted. Research on the channel vocoder ultimately led to the phase vocoder implementation mentioned above.
Linear prediction, also mentioned above, models the vocal tract as a source followed by a series of filters. These filters can be modeled as a series of tubes, and the tube parameters can be transmitted. There is, unfortunately, no intuitive relationship between tube parameters and, say, the spectrogram representation, but LPC is certainly adequate for compressing speech for reproduction in chips. One transmits the pitch period, gain, the voiced/unvoiced decision, and a dozen or so filter coefficients.
In a different kind of system, both encoder and decoder contain a lookup table. Each table entry is a vector containing a series of samples. Rather than transmit the samples, one can transmit just the index into the table. If the exact sequence of samples cannot be found, the closest vector is transmitted. This method can be used to transmit the waveform itself or sequences of coefficients for a vocoder.
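A toy sketch of this codebook (vector quantization) idea, assuming NumPy, shows how only an index needs to be transmitted; the codebook values here are invented purely for illustration.

```python
import numpy as np

codebook = np.array([[0, 0, 0, 0],        # hypothetical 4-sample code vectors,
                     [2, 4, 6, 8],        # shared by encoder and decoder
                     [8, 6, 4, 2],
                     [9, 9, 9, 9]], dtype=float)

def encode(vector):
    """Index of the codebook entry closest to the input vector."""
    distances = np.linalg.norm(codebook - vector, axis=1)
    return int(np.argmin(distances))

def decode(index):
    return codebook[index]

segment = np.array([3, 4, 5, 7], dtype=float)
idx = encode(segment)             # only this small integer is transmitted
print(idx, decode(idx))           # 1 [2. 4. 6. 8.] - the closest match, not an exact copy
```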
As we have seen, the basic data rate is 64 kb/sec (CCITT G.711) for 8-bit PCM. With ADPCM, 4 to 6 bits per sample are transmitted, for 32 to 48 kb/sec; there is a 32 kb/sec CCITT standard, G.721, for ADPCM. Some subband coding systems operate as low as 16 kb/sec. For higher-quality speech with subband coding, there is CCITT G.722 for 50-7000 Hz at a 64-kb/sec rate. For various methods of coding, bit rates can fall as low as 2400 bit/s, but with a corresponding reduction in quality. There is a good discussion of the various CCITT standards in the references.
Improvements in quality and lowering of the bit rate are being driven by military research and the usual telephone companies, but also by factors such as the desire to incorporate voice with other data, such as in ISDN, or the need to squeeze more channels from cellular networks.
End of Unit-II