INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC1/SC29/WG11 N0120
MPEG91/346
Kurihama, November 1991
Source: Leonardo Chiariglione - Convenor
Title: MPEG Press Release
Status: Adopted at 16th WG11 meeting
MPEG Press Release
WG11 of ISO/IEC JTC1/SC29 (MPEG) met in Kurihama, Japan, from November 18th to 26th, 1991. In the first week of the meeting a Committee Draft (CD) of the MPEG standard was finalised. The techniques developed by the MPEG Committee will enable many applications requiring digitally compressed video and sound. The storage media targeted by MPEG include CD-ROM, DAT and computer disks, and it is expected that MPEG-based technologies will eventually be used in a variety of communication channels, such as ISDN and local area networks, and even in broadcasting applications.
At the rate of 1.2 Mbits per second, good quality pictures have been demonstrated at 24, 25 and 30 frames per second, and at a spatial resolution of 360 samples per line. This resolution is consistent with that of consumer-grade television. Coding stereo sound of Compact Disc quality requires approximately 0.2 Mbits per second, resulting in a total rate of 1.4 Mbits per second. Such a rate could permit numerous applications, including video and associated audio on Compact Disc.
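As a worked example of the arithmetic behind these figures, the C sketch below adds the video and audio budgets and estimates how long such a combined stream could play from a disc; the 650-megabyte disc capacity is an assumption chosen purely for illustration.

    #include <stdio.h>

    /* Illustrative arithmetic only: relates the combined 1.4 Mbit/s
     * audio-visual rate to the capacity of a nominal 650-megabyte disc
     * (the capacity figure is an assumption of this sketch). */
    int main(void)
    {
        const double video_rate = 1.2e6;             /* bits per second */
        const double audio_rate = 0.2e6;             /* bits per second */
        const double total_rate = video_rate + audio_rate;

        const double disc_bits = 650.0 * 1e6 * 8.0;  /* nominal disc size */
        const double seconds   = disc_bits / total_rate;

        printf("total rate: %.1f Mbit/s\n", total_rate / 1e6);
        printf("playing time: about %.0f minutes\n", seconds / 60.0);
        return 0;
    }

At 1.4 Mbits per second the sketch reports roughly an hour of programme, consistent with the Compact Disc applications mentioned above.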
The Committee Draft "Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s" consists of three parts: System, Video and Audio. The System part (11172-1) deals with synchronisation and multiplexing of audio-visual information, while the Video (11172-2) and Audio (11172-3) parts address the video and audio compression techniques respectively.
System
The MPEG Systems committee completed and approved for release the technical specification for combining a plurality of coded audio and video streams into a single data stream. The specification provides fully synchronised audio and video and facilitates storage of the combined information on, and its possible further transmission through, a variety of digital media.
This "systems coding" includes necessary and sufficient information in the bit stream to provide the system-level functions of synchronisation of decoded audio and video, initial and continuous management of coded data buffers to prevent overflow and underflow, random access start-up, and absolute time identification. The coding layer specifies a multiplex data format that allows multiplexing of multiple simultaneous audio and video streams as well as privately defined data streams.
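As a rough illustration of how a decoder can locate the individual streams in such a multiplex, the sketch below scans a buffer for the three-byte start-code prefix 0x000001 and classifies the id byte that follows. The id values shown are the ones commonly published for MPEG-1 system streams; a real demultiplexer must also interpret the packet length and header fields, which this sketch ignores.

    #include <stddef.h>
    #include <stdio.h>

    /* Simplified scan over a system-stream buffer: pack header 0xBA,
     * system header 0xBB, private data 0xBD, audio streams 0xC0-0xDF,
     * video streams 0xE0-0xEF (values assumed from the commonly
     * published MPEG-1 stream id assignments). */
    static void classify(unsigned char id)
    {
        if (id == 0xBA)                    printf("pack header\n");
        else if (id == 0xBB)               printf("system header\n");
        else if (id == 0xBD)               printf("private data stream\n");
        else if (id >= 0xC0 && id <= 0xDF) printf("audio stream %d\n", id - 0xC0);
        else if (id >= 0xE0 && id <= 0xEF) printf("video stream %d\n", id - 0xE0);
        else                               printf("other id 0x%02X\n", id);
    }

    static void scan(const unsigned char *buf, size_t len)
    {
        for (size_t i = 0; i + 3 < len; i++)
            if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
                classify(buf[i + 3]);
    }

    int main(void)
    {
        const unsigned char sample[] = { 0x00, 0x00, 0x01, 0xBA,
                                         0x00, 0x00, 0x01, 0xE0,
                                         0x00, 0x00, 0x01, 0xC0 };
        scan(sample, sizeof sample);
        return 0;
    }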
The basic principle of MPEG System coding is the use of time stamps, which specify the decoding and display time of audio and video and the time of reception of the multiplexed coded data at a decoder, all in terms of a single 90 kHz system clock. This method allows a great deal of flexibility in such areas as decoder design, the number of streams, multiplex packet lengths, video picture rates, audio sample rates, coded data rates, and digital storage medium or network performance. It also provides flexibility in selecting which entity is the master time base, while guaranteeing that synchronisation and buffer management are maintained. Variable data rate operation is supported. A reference model of a decoder system is specified which provides limits for the ranges of parameters available to encoders and provides requirements for decoders.
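A minimal sketch of the time-stamp arithmetic follows: tick counts of the 90 kHz system clock are converted to seconds, and the audio-to-video skew a decoder would have to correct is reported. The tick values are invented for illustration.

    #include <stdio.h>

    /* MPEG system time stamps count ticks of a single 90 kHz clock. */
    #define SYSTEM_CLOCK_HZ 90000.0

    static double ticks_to_seconds(unsigned long ticks)
    {
        return ticks / SYSTEM_CLOCK_HZ;
    }

    int main(void)
    {
        unsigned long video_pts = 273000;  /* hypothetical video time stamp */
        unsigned long audio_pts = 270000;  /* hypothetical audio time stamp */

        double skew = ticks_to_seconds(video_pts) - ticks_to_seconds(audio_pts);
        printf("video at %.4f s, audio at %.4f s, skew %.1f ms\n",
               ticks_to_seconds(video_pts), ticks_to_seconds(audio_pts),
               skew * 1000.0);
        return 0;
    }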
Some optional sets of constraints provide a framework for common industry acceptance of certain key parameters for use by decoder designers and information providers. While the MPEG Systems specification is included in the current work item of MPEG, it is designed for compatibility with future extensions to audio, video and hypermedia coding, and with a wide variety of bitrates.
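The sketch below shows how a decoder designer might test a parameter set against such an optional set of constraints. The numeric limits used (for example the 768 by 576 maximum picture size and the 1.856 Mbit/s maximum rate) are the constrained-parameter values commonly cited for this standard; treat them as assumptions of the sketch rather than as normative text.

    #include <stdio.h>

    /* Candidate coding parameters to be checked. */
    struct params {
        int width, height;    /* picture size in pixels */
        double picture_rate;  /* pictures per second */
        double bit_rate;      /* bits per second */
    };

    /* Limits below are the commonly cited constrained-parameter values
     * and are assumptions of this sketch. */
    static int is_constrained(const struct params *p)
    {
        int mbs = ((p->width + 15) / 16) * ((p->height + 15) / 16);
        return p->width <= 768 &&
               p->height <= 576 &&
               mbs <= 396 &&
               mbs * p->picture_rate <= 396 * 25 &&
               p->picture_rate <= 30.0 &&
               p->bit_rate <= 1.856e6;
    }

    int main(void)
    {
        struct params sif = { 352, 240, 30.0, 1.2e6 };  /* typical SIF video */
        printf("constrained: %s\n", is_constrained(&sif) ? "yes" : "no");
        return 0;
    }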
Video
Dozens of algorithmic approaches were carefully reviewed over a period of three years to refine and perfect the video compression algorithm. While the MPEG compression algorithm is optimised for bitrates of about 1.5 Mbit/s, it can perform very effectively over a wide range of bitrates and picture resolutions. The video standard does not prescribe a particular way of encoding pictures; much flexibility is given to implementers of the standard to use the MPEG syntax to optimise visual quality and access options. Particular attention is given to color resolution, so as to support computer applications, games and animation.
The compression techniques developed in MPEG rely on the discrete cosine transform (DCT) for spatial redundancy reduction and on motion-compensated inter-frame coding, which takes into account the high temporal correlation of video signals by using information from both the past and the future. The statistics of the resulting information can also be exploited to further reduce the bitrate through the use of special codes known as Huffman codes. While the discrete cosine transform has been widely used for many years, the techniques developed within MPEG also exploit the characteristics of the human visual system to optimise the perceived image quality. Any residual coding impairments are concentrated in frequencies and regions of the picture where they are perceptually least noticeable.
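For readers unfamiliar with the transform, the sketch below implements the textbook (unoptimised) 8 by 8 forward discrete cosine transform used for spatial redundancy reduction. Practical encoders use fast factored forms; this direct version is for illustration only.

    #include <math.h>
    #include <stdio.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    #define N 8

    /* Direct-form 8x8 forward DCT-II: out[u][v] holds the coefficient
     * for vertical frequency u and horizontal frequency v. */
    static void fdct8x8(const double in[N][N], double out[N][N])
    {
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double cu = (u == 0) ? 1.0 / sqrt(2.0) : 1.0;
                double cv = (v == 0) ? 1.0 / sqrt(2.0) : 1.0;
                double sum = 0.0;
                for (int x = 0; x < N; x++)
                    for (int y = 0; y < N; y++)
                        sum += in[x][y]
                             * cos((2 * x + 1) * u * M_PI / (2.0 * N))
                             * cos((2 * y + 1) * v * M_PI / (2.0 * N));
                out[u][v] = 0.25 * cu * cv * sum;
            }
        }
    }

    int main(void)
    {
        double block[N][N], coef[N][N];
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++)
                block[x][y] = 100.0;  /* flat block: all energy goes to DC */
        fdct8x8(block, coef);
        printf("DC coefficient = %.1f\n", coef[0][0]);  /* prints 800.0 */
        return 0;
    }

A flat block concentrates all its energy in the single DC coefficient, which is why smooth image regions compress so well under this transform.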
Audio
The audio coding experts of MPEG finalised an audio coding algorithm after having reviewed and tested many approaches over the last three years. The resulting digital audio bitrate reduction technique supports several bitrates covering a range from intermediate to Compact Disc quality. The latter quality can be obtained at a total bitrate of 256 kbit/s for a stereophonic programme.
Depending on the application, three layers of the coding system with increasing complexity and performance can be used. In all three layers the time-domain input audio signal is converted into a frequency representation. In Layers I and II a filterbank creates 32 subband representations of the input audio stream, which are then quantised and coded under the control of a psychoacoustic model from which a blockwise adaptive bit allocation is derived. Compared with Layer I, Layer II introduces further compression through redundancy and irrelevance removal on the scalefactors and through more precise quantisation. In Layer III additional frequency resolution is provided by the use of a hybrid filterbank: every subband is further split into higher-resolution frequency lines by a linear transform that operates on 18 subband samples in each subband. The frequency lines are again quantised and coded under the control of a psychoacoustic model. In Layer III, nonuniform quantisation, adaptive segmentation and entropy coding of the quantised values are employed for better coding efficiency.
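To make the subband coding idea concrete, the sketch below normalises a block of subband samples by a scalefactor (the peak magnitude of the block) and quantises them uniformly with the number of bits a psychoacoustic model might allocate to that subband. The 12-sample block length and the uniform quantiser are simplifications of the actual specification.

    #include <math.h>
    #include <stdio.h>

    #define BLOCK 12

    /* Normalise a subband block by its peak magnitude and quantise the
     * result with the allocated number of bits (simplified scheme). */
    static void quantise_block(const double s[BLOCK], int bits,
                               int q[BLOCK], double *scalefactor)
    {
        double peak = 1e-12;
        for (int i = 0; i < BLOCK; i++)
            if (fabs(s[i]) > peak) peak = fabs(s[i]);
        *scalefactor = peak;             /* transmitted to the decoder */

        int steps = (1 << bits) - 1;     /* quantiser steps available */
        for (int i = 0; i < BLOCK; i++) {
            double norm = s[i] / peak;   /* now within [-1, 1] */
            q[i] = (int)lround((norm + 1.0) * 0.5 * steps);
        }
    }

    int main(void)
    {
        double s[BLOCK] = { 0.50, -0.25, 0.10, 0.00, -0.60, 0.30,
                            0.20, -0.10, 0.05, 0.40, -0.30, 0.15 };
        int q[BLOCK];
        double sf;
        quantise_block(s, 4, q, &sf);    /* 4 bits allocated by the model */
        printf("scalefactor %.2f, first code %d\n", sf, q[0]);
        return 0;
    }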
The range of bitrates (total rate for both channels) provided by the standard is between 64 kbit/s and 448 kbit/s. The standard also supports coding starting at 32 kbit/s for a single channel. In all layers, a joint stereo mode that exploits stereophonic irrelevance or stereophonic redundancy can be used as an option to improve the subjective quality.
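These bitrates translate directly into coded frame sizes. The sketch below uses the factor of 144 that is commonly quoted for relating bitrate and sampling rate to Layer II frame length (1152 samples per frame divided by 8 bits per byte); padding and the Layer I variant are ignored.

    #include <stdio.h>

    /* Approximate Layer II frame size in bytes: 144 * bitrate / fs
     * (commonly quoted form; padding is ignored in this sketch). */
    int main(void)
    {
        int bitrate = 192000;     /* bits per second, within 64-448 kbit/s */
        int sample_rate = 44100;  /* Hz */

        int frame_bytes = 144 * bitrate / sample_rate;
        printf("each coded frame is about %d bytes\n", frame_bytes);
        return 0;
    }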
Next Phase of the MPEG Standard
The success of the initial phase of work will let the experts of MPEG focus on developing a generic standard for the compression of higher-resolution video signals, to be used for storage as well as for communication applications, with bitrates in the range of 5 to 10 Mbits/s. This work is being conducted in close collaboration with the CCITT Experts Group for ATM (Asynchronous Transfer Mode) video coding.
In the same week, 32 different video coding proposals targeted at these bitrates were tested. The test consisted of a subjective quality assessment and an analysis of the implementation complexity of each proposal. The results show that distribution-quality video is achievable at these rates at a reasonable implementation cost. The proposing organisations represent a wide sample of companies in consumer electronics, broadcasting, computers and telecommunications, as well as universities.
After analysing the results of the test, the experts are beginning a phase of collaborative algorithm improvements with the goal of achieving a unified solution during 1992.