Wednesday, July 4, 2007

Handset Sound Capabilities

Handsets need audio coders and decoders for a variety of reasons, the most fundamental
being the encoding and decoding of human speech for telephony services.
However, with a typical handset now supporting multimedia functions, the sound
capabilities will support music, the audio that accompanies a video clip, and applications
such as games and ringtones.

Ringtones
Ringtones have become big business as users attempt to differentiate their handsets
from otherwise identical models. The ringtone represents an easy route to phone
personalization. There are numerous formats (Figure 4.28) in which to provide
ringtones, and this variety reflects the dramatic change in capability from the very
early phones, which could only support monophonic tones, to the devices of today,
which are able to support complex music files coded with the same techniques used
to record CDs.

Some of the ringtone formats that have appeared in the marketplace are proprietary
and may be supported only by a limited range of models from one supplier or a
small number of suppliers.

Ringtone Formats
Many early handsets had very limited ringtone support: the tone output was
monophonic because the sound hardware could play only one note at a time. The
manufacturer might supply a handful of built-in ringtones for the user to choose
from, and there was no capability for downloading new tones. Monophonic tones
sound very artificial.

Increased capability brought polyphonic ringtones to handsets and, combined
with features and services that allowed users to download new tones on their
phones, the market for ringtones was created. There are a number of polyphonic
ringtone formats (see Table 4.2), including the Musical Instrument Digital Interface
(MIDI). MIDI differs from the other tone formats in that the MIDI file does
not actually contain coded music, but rather a set of instructions about the notes
to be played, the voice to be used for each note, and the duration and depth of each
note. The consequence is that MIDI files are very compact and therefore ideally
suited to ringtone downloads.
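
To make the size argument concrete, the following minimal sketch compares a list of note instructions with uncompressed audio of the same length. The five-byte note record (`NOTE_FORMAT`) and the helper names are illustrative assumptions, not the real MIDI file encoding; the point is only that a set of instructions is far smaller than the coded sound it describes.

```python
import struct

# Hypothetical note record: pitch, voice, velocity (one byte each) and a
# 16-bit duration in milliseconds. Illustrative only; this is NOT the
# actual Standard MIDI File layout.
NOTE_FORMAT = "<BBBH"  # 5 bytes per note, no padding


def encode_notes(notes):
    """Pack (pitch, voice, velocity, duration_ms) tuples into bytes."""
    return b"".join(struct.pack(NOTE_FORMAT, *note) for note in notes)


def pcm_size_bytes(duration_ms, sample_rate=8000, bytes_per_sample=2, channels=1):
    """Size of uncompressed PCM audio covering the same duration."""
    return duration_ms * sample_rate * bytes_per_sample * channels // 1000


# An eight-note phrase, each note 250 ms long (2 seconds in total).
melody = [(60 + step, 0, 100, 250) for step in (0, 2, 4, 5, 7, 9, 11, 12)]
instruction_bytes = encode_notes(melody)
total_ms = sum(duration for _, _, _, duration in melody)

print(len(instruction_bytes))    # 40 bytes of note instructions
print(pcm_size_bytes(total_ms))  # 32000 bytes of 8 kHz, 16-bit mono PCM
```

Under these assumptions the instruction form is several hundred times smaller than the raw audio, which is why it suits over-the-air download.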

An improvement on MIDI is Scalable Polyphonic MIDI, which allows the same
content to be played on devices that differ in terms of their polyphonic capability.

A low-end phone might only have four-note polyphony while a high-end phone
might have 32-note polyphony; the same file could play on both handsets through
a process of scaling.
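
One way to picture the scaling step is as a priority list: the content ranks its parts and states how many simultaneous notes are needed to play everything down to each rank, and the handset keeps as many parts as its polyphony allows. The sketch below is a simplified illustration of that idea; the table contents and the function name `select_channels` are assumptions, not taken from the Scalable Polyphonic MIDI specification.

```python
def select_channels(channel_table, device_polyphony):
    """Return the parts a device can play.

    channel_table is a list of (part_name, cumulative_notes_needed) pairs
    in priority order, most important first. A part is kept only if the
    device can sound every note required up to and including that part.
    """
    playable = []
    for name, cumulative_notes in channel_table:
        if cumulative_notes > device_polyphony:
            break  # this part and all lower-priority parts are muted
        playable.append(name)
    return playable


# Hypothetical ringtone: five parts with growing polyphony requirements.
ringtone = [
    ("melody", 2),
    ("bass", 4),
    ("chords", 10),
    ("drums", 16),
    ("strings", 32),
]

print(select_channels(ringtone, 4))   # ['melody', 'bass']
print(select_channels(ringtone, 32))  # all five parts
```

The low-end handset plays a reduced but still recognizable arrangement, while the high-end handset plays the full one from the same file.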

The eXtensible Music Format (XMF) was introduced to overcome the limitations
of MIDI's fixed palette of 128 instruments. XMF allows new sounds to be
downloaded to replace the default MIDI sounds.

More recently, ringtones have been provided as MP3 files using the same audio
coding techniques used for music distribution. These files offer much more realistic
sound capabilities, and the availability of chart music as ringtones is evidence of the
popularity of this format.

There are a number of manufacturer-specific ringtone formats in use and also
formats devised by third parties; for example, the polyphonic Synthetic music
Mobile Application Format (SMAF) from Yamaha is supported on a range of
phones from different manufacturers.

Audio Coding
As with ringtones, handsets must be capable of supporting a wide range of audio
formats if end users are to play audio from a variety of sources. There are
two distinct families of audio coder found in handsets. The first family is related
to the need to code the human voice for telephony services, although some of the
coders used are derivatives that support signals with a wider bandwidth than speech
(e.g., music). The second family of coders consists of those that comprise the audio
layer used in video coding techniques.

GSM handsets were originally built around the Full-Rate (FR) codec, which
was later supplemented by the Half-Rate (HR) codec, the Enhanced Full-Rate
(EFR) codec, and the Enhanced Half-Rate (EHR) codec (Figure 4.29). All these
coding mechanisms are built around a model of the human voice and, therefore,
while they offer good quality for speech, they are not optimized for non-speech
signals such as music.

The GSM specifications moved on to an Adaptive Multi-Rate (AMR) codec
that was also adopted as the standard by 3GPP for UMTS networks. This codec
could switch rates according to needs and conditions, but was still speech oriented.
However, recent improvements have been made to the codec, first by improving
quality, and second by extending the audio bandwidth and adding stereo capability.
Thus, the codec has evolved to support not only voice, but also high-quality
audio, including stereo music.
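
The rate switching mentioned above can be sketched as a simple lookup. The eight bit rates listed are the narrowband AMR codec modes; the `channel_quality` metric and the linear mapping onto modes are purely hypothetical, since real networks use their own link adaptation rules.

```python
# The eight narrowband AMR codec modes, in kbit/s.
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]


def pick_amr_mode(channel_quality):
    """Map a hypothetical channel quality in [0.0, 1.0] to an AMR mode.

    Poor channels get a low speech bit rate; good channels get a high one.
    The linear mapping is an illustrative assumption.
    """
    quality = min(max(channel_quality, 0.0), 1.0)  # clamp to [0, 1]
    index = min(int(quality * len(AMR_MODES_KBPS)), len(AMR_MODES_KBPS) - 1)
    return AMR_MODES_KBPS[index]


print(pick_amr_mode(0.0))  # 4.75
print(pick_amr_mode(0.5))  # 7.4
print(pick_amr_mode(1.0))  # 12.2
```

Choosing a lower speech rate on a poor channel leaves more of the available capacity for error protection, which is the usual motivation for adapting the rate to conditions.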

The second family of codecs found in handsets is based on Advanced Audio
Coding (AAC), taken from the MPEG specifications. As with the AMR codec, the
AAC codec has evolved to improve quality and support stereo signals.
