GL Communications Inc.
 
 
 

Voice Codecs

GL Communications products support a variety of signaling and audio processing applications in both VoIP and TDM. Using these tools, one can emulate, analyze, and troubleshoot audio signaling over both VoIP and TDM. Each of these tools supports the following narrowband and wideband (HD audio) codec standards:

| Codec | Data Rate | Sampling Rate (Hz) | VAD | Supported Packetisation Time (Ptime) |
|---|---|---|---|---|
| G.711 (PCM µ-law/A-law) | 64 kbps | 8000 | No | Multiples of 10 ms |
| G.711 App II (PCM µ-law/A-law with VAD) | 64 kbps | 8000 | Yes | Multiples of 10 ms |
| G.722 | 64 kbps | 16000 | No | Multiples of 10 ms |
| G.722.1 (Wideband) | 24 kbps, 32 kbps | 16000 | No | Multiples of 10 ms |
| G.729 | 8 kbps | 8000 | No | Multiples of 10 ms |
| G.729B | 8 kbps | 8000 | Yes | Multiples of 10 ms |
| GSM 6.10 FR | 13.2 kbps | 8000 | No | Multiples of 20 ms |
| GSM EFR | 12.2 kbps | 8000 | Yes | Fixed at 20 ms; multiple Ptime not supported |
| GSM HR | 5.6 kbps | 8000 | Yes | Multiples of 20 ms |
| G.726 | 40 kbps (5 bit), 32 kbps (4 bit), 24 kbps (3 bit), 16 kbps (2 bit) | 8000 | No | Multiples of 10 ms |
| G.726 (with VAD) | 40 kbps (5 bit), 32 kbps (4 bit), 24 kbps (3 bit), 16 kbps (2 bit) | 8000 | Yes | Multiples of 10 ms |
| AMR (requires additional license) | 4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2 kbps | 8000 | Yes | Multiples of 20 ms |
| AMR WB (Wideband) (requires additional license) | 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 kbps | 16000 | Yes | Multiples of 20 ms |
| EVRC, EVRC0 (requires additional license) | EVRC rates 1/8, ½ and 1 | 8000 | No | Multiples of 20 ms |
| EVRCB, EVRCB0 (requires additional license) | EVRCB rates 1/8, ¼, ½ and 1 | 8000 | Yes | Multiples of 20 ms |
| EVRC_C (Wideband) (requires additional license) | | 16000 | Yes | Multiples of 20 ms |
| SMV (requires additional license) | Modes 0, 1, 2 and 3 | 8000 | No | Multiples of 20 ms |
| iLBC | 15.2 kbps | 8000 | No | Multiples of 20 ms |
| iLBC | 13.33 kbps | 8000 | No | Multiples of 30 ms |
| SPEEX (Narrow Band) | 8 kbps | 8000 | Yes | Fixed at 20 ms; multiple Ptime not supported |
| SPEEX (Wideband) | 11.2 kbps | 16000 | Yes | Fixed at 20 ms; multiple Ptime not supported |

  • µ-Law, A-law (G.711)
    PCM has been the standard for digital voice transmission in telephony since 1972. It captures speech in the range of 300 Hz to 3.4 kHz and samples at 8000 samples/second with 8 bits per sample, resulting in 64 kbps with an encoding frame length of 10 ms. The two algorithms defined in the standard are µ-Law (used in North America and Japan) and A-law (used in Europe and the rest of the world). Both are logarithmic, but A-law was specifically designed to be simpler for a computer to process.

  • G.722, G.722.1
    G.722[1] is an ITU-T wideband speech codec standard that samples audio at 16 kHz (14 bits per sample) and operates at 48, 56 and 64 kbps with an encoding frame length of 10 ms. The codec is based on sub-band ADPCM (SB-ADPCM). Its 16 kHz sampling rate is double that of traditional telephony interfaces, which results in superior audio quality and clarity.

  • G.729, G.729B
    G.729 encodes voice using CS-ACELP (Conjugate-Structure Algebraic Code Excited Linear Prediction) at a bit rate of 8 kbps, with an encoding frame length of 10 ms and a 5 ms look-ahead. It is the lowest bit rate ITU-T standard with toll quality. Two extensions are commonly designated G.729a and G.729b: Annex A is a low-complexity version of the G.729 standard, and Annex B defines VAD/CNG/DTX (Voice Activity Detection/Comfort Noise Generator/Discontinuous Transmission) for G.729 and G.729A.

  • GSM-FR
    GSM-FR is a Full Rate speech coder standardized by the European Telecommunications Standards Institute (ETSI) for compressing toll quality speech (8000 samples / second) and was the first digital speech coding standard used in GSM digital mobile phone systems. The coder has a bit rate of 13 kbps with an encoding frame length of 20 ms.

    This coder uses the principle of Regular Pulse Excitation-Long Term Prediction-Linear Predictive coding. The coder works on a frame of 160 speech samples with an encoding frame length of 20 ms, and no look ahead is required.

  • GSM EFR
    GSM-EFR (6.60) is the Enhanced Full Rate codec, an improved version of the GSM-FR (6.10) codec. With a sampling frequency of 8000 samples/sec and a frame size of 31 bytes, it achieves a bit rate of 12.2 kbps with a fixed encoding frame length of 20 ms. The codec supports Voice Activity Detection (VAD) to save bandwidth.

  • GSM HR
    GSM HR 6.20 operates with a sampling frequency of 8000 samples/sec. The codec outputs frames of 14 bytes, which puts the encoder bit rate at 5.6 kbps with an encoding frame length of 20 ms. The codec supports Voice Activity Detection (VAD) to save bandwidth.

  • G.726 (ADPCM)
    G.726 is an ADPCM (Adaptive Differential Pulse Code Modulation) codec, originally introduced as a half-rate alternative to ITU-T G.711. It covers both the earlier G.721 and G.723 standards. G.726 compresses by converting between linear, A-law (used in Europe) or µ-Law (used in the U.S. and Japan) PCM and a 40, 32, 24 or 16 kbps ADPCM stream with an encoding frame length of 10 ms.

  • G.726 with Voice Activity Detection (ADPCM)
    This is an ITU-T adaptive differential pulse code modulation (ADPCM) voice codec, which transmits at bit rates of 16, 24, 32, and 40 kbps with an encoding frame length of 10 ms. It supports Voice Activity Detection and generates SID packets during silence periods. ADPCM provides the following functionality:
    • Voice mail recording and playback, which is a requirement for Internet voice mail.
    • Voice transport for cellular, wireless, and cable markets.
    • High-quality voice transport at 32 kbps.

  • AMR NB and WB (requires additional license)
    AMR is the 3GPP mandatory standard codec for narrowband speech and multimedia messaging services over GSM and evolved GSM (WCDMA, GPRS and EDGE) networks. It was designed to provide transcoder-free connectivity between GSM, US-TDMA and Personal Digital Cellular (Japan) networks.

    AMR operates at eight bit rates in the range of 4.75 to 12.2 kbps with an encoding frame length of 20 ms and was specifically designed to improve link robustness.

    AMR-WB provides improved speech quality because of its wider speech bandwidth of 50–7000 Hz, compared to narrowband speech coders, which in general are optimized for POTS wireline quality of 300–3400 Hz.

  • Enhanced Variable Rate Codec (EVRC), EVRC-B, EVRC-C
    For the EVRC codec type, three rates are provided (1/8, ½ and 1), with an encoding frame length of 20 ms. By default, 1/8 and 1 are selected as the minimum and maximum rates. The minimum rate must be less than or equal to the maximum rate. The RTP packet format can be set to either Header Free Format or Bundled Format; Bundled Format is the default.

    EVRC-B is an enhancement to EVRC. The EVRCB codec type compresses each 20 milliseconds of 8000 Hz, 16-bit sampled speech input into an output frame at one of four rates: 1/8 (16 bits), ¼ (40 bits), ½ (80 bits), or 1 (171 bits). By default, 1/8 and 1 are selected as the minimum and maximum rates. The RTP packet format can be set to either Header Free Format or Bundled Format; Bundled Format is the default.

    An important enhancement in EVRC-B is the use of 1/4 rate frames, which were not used in EVRC. This provides lower average data rates (ADRs) than EVRC for a given voice quality.

    EVRC-C adds the feature of encoding wideband signals sampled at 16 kHz with signal bandwidth up to 7 kHz.
  • Enhanced Voice Services Codec (EVS)

    EVS provides improved voice quality, network capacity and advanced features for voice services over LTE and other radio access technologies standardized by 3GPP. It is the first 3GPP conversational codec to provide up to 20 kHz audio bandwidth, offering speech quality of the highest standard.

    EVS codec includes a multi-rate audio codec, a source controlled variable bit-rate (SC-VBR) scheme, a VAD, a comfort noise generation (CNG) system, and an error concealment (EC) mechanism to offset the effects of transmission errors resulting in lost packets.  Its channel-aware mode feature further improves frame/packet error resilience.
  • SMV
    The Selectable Mode Vocoder (SMV) [2] compresses each 20 milliseconds of 8000 Hz, 16-bit sampled speech input into output frames of one of the four different sizes: Rate 1 (171 bits), Rate 1/2 (80 bits), Rate 1/4 (40 bits), or Rate 1/8 (16 bits) with an encoding frame length of 20 ms. SMV is the preferred speech codec standard for CDMA2000, and will be deployed in third generation handsets.

  • SPEEX NB and WB
    SPEEX NB is an open source, CELP-based narrowband codec (8 kHz sampling with a fixed encoding frame length of 20 ms) used specifically for VoIP and file-based applications.

    The SPEEX WB codec has a sampling rate of 16000 samples/sec with a fixed encoding frame length of 20 ms, which makes it a wideband codec. It supports codec options such as Sampling Rate, Variable Bit Rate, Voice Activity Detection and Perceptual Enhancement.

  • iLBC Codec
    iLBC (internet Low Bitrate Codec) is a narrowband speech codec that operates at either 13.33 kbps with an encoding frame length of 30 ms or 15.20 kbps with an encoding frame length of 20 ms. Companies that use iLBC in their commercial products include:
    • Applications/Soft phones: Skype, Nortel, Webex, Hotsip, Marratech, Gatelinx, K-Phone, XTen;
    • IP Phones: WorldGate, Grandstream, Pingtel;
    • Chip: Audiocodes, TI Telogy, LeadTek, Mindspeed.

    The 13.33 kbps mode encodes each 30 ms frame into 399 bits (packed into 50 bytes) and is designated in RTP Toolbox as iLBC_13_33.

    The 15.2 kbps mode encodes each 20 ms frame into 303 bits (packed into 38 bytes) and is labeled iLBC in RTP Toolbox. The basic quality is higher than that of G.729A.
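
The G.711 entry at the top of this list notes that both µ-Law and A-law are logarithmic. A minimal sketch of the continuous companding curves is given here, assuming the usual parameter values µ = 255 and A = 87.6 and input samples normalized to the range -1.0 to 1.0. G.711 itself specifies segmented 8-bit approximations of these curves, which this sketch does not reproduce; the function names are illustrative only.

```python
import math

MU = 255.0  # assumed mu-law parameter (North America and Japan)
A = 87.6    # assumed A-law parameter (Europe and rest of the world)


def mu_law_compress(x: float) -> float:
    """Continuous mu-law curve for a sample x in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def a_law_compress(x: float) -> float:
    """Continuous A-law curve for a sample x in [-1.0, 1.0]."""
    ax = abs(x)
    denom = 1.0 + math.log(A)
    if ax < 1.0 / A:
        # Linear segment for very quiet samples.
        y = A * ax / denom
    else:
        # Logarithmic segment for everything else.
        y = (1.0 + math.log(A * ax)) / denom
    return math.copysign(y, x)


if __name__ == "__main__":
    for level in (0.001, 0.01, 0.1, 0.5, 1.0):
        print(level, mu_law_compress(level), a_law_compress(level))
```

Both curves spend most of the output range on quiet samples, which is why 8 bits per sample are enough for telephone-quality speech.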

Some Definitions:

| Term | Definition |
|---|---|
| Codec Bit Rate (kbps) | Number of bits per second that must be transmitted to deliver a voice call (codec bit rate = codec sample size / codec sample interval). For G.711: 64 kbps = (160 bytes * 8 bits) * (1/20 ms). For G.729: 8 kbps = (20 bytes * 8 bits) * (1/20 ms). |
| Voice Payload in Bytes | The number of bytes (or bits) that are filled into a packet. |
| PPS | The number of packets that must be transmitted every second to deliver the codec bit rate: PPS = 1/(voice payload in ms). For example, 50 PPS = (1/20 ms) and approximately 33 PPS = (1/30 ms). |
| R-Factor | Quality score based on various end point and network parameters. Includes codecs, packet loss, and delay. |
| Conversational R-Factor | The voice quality metric that measures voice quality based on transmission delay, burst packet loss, and burst loss recency. |
| Listening R-Factor | The voice quality metric based only on burst packet loss and codec selection. |
| MOS-LQ | Mean Opinion Score based on listening quality. Does not consider recency or delay (ITU-T P.862 Listening Quality implementations). |
| MOS-CQ | Mean Opinion Score based on conversational quality. Includes recency and delay effects. |
| MOS-PQ | ITU-T P.862 normalized raw quality score. |
| MOS-Nom | Nominal quality or maximum score for the codec selected. Similar to the G.107 E-model defaults. |
| Recency | A time factor used to weight scores based on the time from a burst packet loss to the end of the call or the next packet loss event. |
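
The bit rate and PPS definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are hypothetical, and the payload sizes used in the demonstration are the ones quoted on this page.

```python
def codec_bit_rate_bps(payload_bytes: int, interval_ms: float) -> float:
    """Codec bit rate in bits per second: payload bits divided by the packet interval."""
    return payload_bytes * 8 * 1000 / interval_ms


def packets_per_second(interval_ms: float) -> float:
    """Packets per second needed to sustain the stream: 1 / packet interval."""
    return 1000 / interval_ms


if __name__ == "__main__":
    # (codec label, voice payload in bytes, packet interval in ms)
    examples = [
        ("G.711", 160, 20),
        ("G.729", 20, 20),
        ("GSM HR", 14, 20),
        ("iLBC 30 ms", 50, 30),
        ("iLBC 20 ms", 38, 20),
    ]
    for name, size, interval in examples:
        kbps = codec_bit_rate_bps(size, interval) / 1000
        pps = packets_per_second(interval)
        print(f"{name}: {kbps:.2f} kbps at {pps:.1f} PPS")
```

Note that these figures cover the voice payload only and do not include any packet header overhead.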

Estimating Speech Quality

The mean opinion score (MOS) is a commonly used method to determine the quality of speech. Every codec has its own speech quality characteristics. With MOS, the quality of speech is rated on a scale of 1 (bad) to 5 (excellent).

More recently, speech quality estimates have been based on the ITU-T G.107 E-Model. This model considers the entire mouth-to-ear path and all relevant conditions such as end-to-end level, echo, side tone, and frequency characteristics of the various path segments.

The E-Model uses a computational method that includes factors such as noise, signal level, loudness ratings, impairments, delay, codec type, and even network type to derive a quality score. This transmission quality rating is called the ‘R’ factor. Over time, and based on experience with subjective and objective measurements, the E-Model's R-Factor score was mapped to an equivalent Mean Opinion Score (Excellent to Bad) to predict the quality of the “mouth to ear” (M2E) speech path. Scoring takes into account the type of subjective test used: passive/listening and active/conversational tests produce slightly different scores.
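
As an illustration of the R-Factor to MOS mapping, the sketch below implements the standard conversion published in ITU-T G.107. This is an assumption about which mapping is relevant. It does not reproduce the nominal values tabulated later in this section (for example, R = 93 gives about 4.41 here, while the table lists 4.2 for G.711), so the table evidently uses a different scaling. The function name is hypothetical.

```python
def r_to_mos(r: float) -> float:
    """Map an E-Model R-Factor to an estimated MOS using the ITU-T G.107 formula."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6
    # The cubic dips marginally below 1 for very small R, so clamp it
    # (the clamp is an addition of this sketch, not part of the formula).
    return max(1.0, mos)


if __name__ == "__main__":
    for r in (50, 60, 70, 80, 90, 93, 100):
        print(r, round(r_to_mos(r), 2))
```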

For IP networks, the score assumes ideal conditions outside the IP cloud and is based on the relevant IP impairments such as packet loss, latency, and jitter, and even on when these impairments occur over the duration of the call.

| Codec Name | Nominal MOS | Nominal R factor |
|---|---|---|
| G.711 µ-Law (64 kbps) | 4.2 | 93 |
| G.711 A-law (64 kbps) | 4.2 | 93 |
| G.722 (64 kbps) | 3.91 | 96 |
| G.722.1 (32 kbps) | 4.09 | 102 |
| G.722.1 (24 kbps) | 3.98 | 98 |
| G.729A/G.729AB (8 kbps) | 3.91 | 82 |
| GSM-FR | 3.57 | 73 |
| GSM HR | 3.53 | 72 |
| GSM EFR | 4.16 | 91 |
| G.726-40k | 4.16 | 91 |
| G.726-32k | 4.04 | 86 |
| G.726-24k | 3.35 | 68 |
| G.726-16k | 2.82 | 57 |
| G.726-40k with VAD | 4.16 | 91 |
| G.726-32k with VAD | 4.04 | 86 |
| G.726-24k with VAD | 3.35 | 68 |
| G.726-16k with VAD | 2.82 | 57 |
| AMR NB Mode 0 (4.75 kbps) (requires additional license) | 3.65 | 75 |
| AMR NB Mode 1 (5.15 kbps) | 3.73 | 77 |
| AMR NB Mode 2 (5.9 kbps) | 3.84 | 80 |
| AMR NB Mode 3 (6.7 kbps) | 3.95 | 83 |
| AMR NB Mode 4 (7.4 kbps) | 3.98 | 84 |
| AMR NB Mode 5 (7.95 kbps) | 4.04 | 86 |
| AMR NB Mode 6 (10.2 kbps) | 4.11 | 89 |
| AMR NB Mode 7 (12.2 kbps) | 4.16 | 91 |
| AMR WB Mode 0 (6.6 kbps) (requires additional license) | 3.39 | 81 |
| AMR WB Mode 1 (8.85 kbps) | 3.81 | 92 |
| AMR WB Mode 2 (12.65 kbps) | 4.04 | 100 |
| AMR WB Mode 3 (14.25 kbps) | 4.09 | 103 |
| AMR WB Mode 4 (15.85 kbps) | 4.11 | 104 |
| AMR WB Mode 5 (18.25 kbps) | 4.14 | 105 |
| AMR WB Mode 6 (19.85 kbps) | 4.18 | 107 |
| AMR WB Mode 7 (23.05 kbps) | 4.18 | 107 |
| AMR WB Mode 8 (23.85 kbps) | 4.18 | 107 |
| EVRC (requires additional license) | 3.94 | 83 |
| EVRCB (requires additional license) | 3.98 | 84 |
| SMV | 3.88 | 81 |
| Speex WB (11.2 kbps) | 4.16 | 106 |
| Speex NB | 4.16 | 91 |
| iLBC 13.3k | 3.88 | 81 |
| iLBC 15.2k | 3.95 | 83 |

GL’s Applications supporting Speech Codecs

| Capture and Analysis Products | Simulation Products |
|---|---|
| ISDN Triggered Call Capture & Analysis Application (over TDM) | RTP ToolBox™ |
| ISDN PRI Analyzer | PacketGen™ |
| Multi-Channel Audio Bridge | MAPS™ MGCP, MAPS™ MEGACO |
| Spectral Analysis and Oscilloscope Display | MAPS™ SIP, MAPS™ SIP I |
| Voice Band Analyzer (VBA) | MAPS™ SIGTRAN, MAPS™ ISDN SIGTRAN |
| PacketScan™ (over IP) | MAPS™ BICC IP, MAPS™ CAMEL IP |
| VQT Analysis for NB, WB, and SWB speech | VQuad™ SIP |

Buyer's Guide:

| Item No. | Item Description |
|---|---|
| PCD103 | Optional Codec – AMR – Narrowband (requires additional license) |
| PCD104 | Optional Codec – EVRC (requires additional license) |
| PCD105 | Optional Codec – EVRC-B (requires additional license) |
| PCD106 | Optional Codec – EVRC-C (requires additional license) |
| PCD107 | Optional Codec – AMR – Wideband (requires additional license) |
| PCD108 | Optional Codec – EVS (requires additional license) |
| | Related Software |
| PKV100 | PacketScan™ (Online and Offline) |
| PKV120 | PacketScan™ HD – includes PKV100 – Online (not Offline) for temporary audio codec support |
| PKB100 | RTP ToolBox™ Application |
| PKS100 | PacketGen™ with PacketScan™ |
| PKS101 | SIP Core (additional) |
| PKS102 | RTP Soft Core for RTP Traffic Generation (additional) |
| PKS103 | RTP IuUP Softcore |
| PKS107 | RTP EUROCAE ED137 |
| PKS106 | RTP Video Traffic Generation |
| PKS108 | RTP Voice Quality Measurements |
| PKS120 | MAPS™ SIP Emulator |
| VQT010 | VQuad™ Software (Stand Alone) |
| VQT013 | VQuad™ with SIP (VoIP) Call Control |


 
 