VQT Detailed Analysis
VQT detailed analysis includes the following measurements:
Jitter data is obtained from the time alignment process. The utterance-by-utterance offset must be determined accurately to get
a speech quality measure. Jitter is the variation in time offset between reference and degraded utterances. The GL VQT reports
utterance offset as minimum, maximum, and standard deviation values. These three values are measures of jitter in the speech
as delivered to the listener. GL also reports the average offset.
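The offset summary described above can be illustrated with a minimal sketch. It assumes the per-utterance offsets are already available in milliseconds; the function name `offset_statistics` and the sample values are hypothetical, and the source does not say whether the tool uses the population or the sample form of the standard deviation.

```python
import statistics

def offset_statistics(offsets_ms):
    """Summarise per-utterance time offsets (ms) between reference and degraded speech."""
    if not offsets_ms:
        raise ValueError("at least one utterance offset is required")
    return {
        "min": min(offsets_ms),
        "max": max(offsets_ms),
        "mean": statistics.fmean(offsets_ms),
        # Population standard deviation is an assumption; the source does not
        # state which form the tool reports.
        "std_dev": statistics.pstdev(offsets_ms),
    }

# Hypothetical offsets for four utterances
stats = offset_statistics([100.0, 102.0, 98.0, 100.0])
print(stats)
```

A wide spread between the minimum and maximum, or a large standard deviation, would point to delay that varies from one utterance to the next.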
Performance Examiner provides a number of diagnostic outputs that relate to the use of muting algorithms and discontinuous
transmission. These outputs are generated by comparing the degraded signal to the reference signal.
Muting of a signal typically occurs when an error concealment algorithm at a receiver has insufficient information to replace
missing or corrupted data. The muting estimate is provided in terms of the proportion of signal frames that have been muted by
the system under test.
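As a rough illustration of a frame-based muting estimate, the sketch below counts frames in which the reference carries signal but the degraded signal is essentially silent. It is not the tool's algorithm: the frame length, both thresholds, and the function names are assumptions, and the two signals are assumed to be time-aligned PCM sample lists.

```python
import math

def frame_rms(samples, frame_len):
    """RMS of consecutive non-overlapping frames; a trailing partial frame is dropped."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def muted_proportion(reference, degraded, frame_len=160,
                     active_rms=100.0, mute_rms=1.0):
    """Fraction of all frames where the reference is active but the degraded is near silent.

    The thresholds (in PCM units) are illustrative assumptions.
    """
    ref = frame_rms(reference, frame_len)
    deg = frame_rms(degraded, frame_len)
    n = min(len(ref), len(deg))
    if n == 0:
        return 0.0
    muted = sum(1 for r, d in zip(ref, deg) if r >= active_rms and d <= mute_rms)
    return muted / n

# Hypothetical signals: four 2-sample frames, two of them muted in the degraded file
reference = [1000, -1000] * 4
degraded = [1000, -1000, 0, 0, 1000, -1000, 0, 0]
print(muted_proportion(reference, degraded, frame_len=2))
```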
Discontinuous transmission (DTX) schemes aim to increase transmission efficiency by ceasing transmission during periods of
talker inactivity. Temporal clipping occurs when the voice activity detection (VAD) algorithm in a DTX system misclassifies part of
a speech utterance as noise, and replaces it with comfort noise at the receiver.
Front-end clipping refers to the case where the start of an utterance has been clipped. Back-end clipping refers to the case
where the end of an utterance has been clipped.
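One way to picture the two clipping measures is as the amount of speech missing at each end of an utterance once the files are time-aligned. The sketch below assumes utterance boundaries (in ms) have already been detected in both files and that the alignment offset has been removed; the function name `clipping_ms` and the boundary values are hypothetical.

```python
def clipping_ms(ref_start, ref_end, deg_start, deg_end):
    """Return (front_end_clipping, back_end_clipping) in ms for one aligned utterance.

    Assumes both boundary pairs are on the same time axis after alignment.
    Negative differences (degraded speech extending beyond the reference)
    are reported as zero clipping.
    """
    front = max(0.0, deg_start - ref_start)  # speech lost at the start
    back = max(0.0, ref_end - deg_end)       # speech lost at the end
    return front, back

# Hypothetical utterance: 30 ms lost at the start, 20 ms lost at the end
print(clipping_ms(0.0, 1000.0, 30.0, 980.0))
```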
Hangover is a term applied to the period after the end of an utterance when a discontinuous transmission scheme continues
to transmit as normal, rather than generating comfort noise.
For each measurement, levels are calculated for the reference and degraded files. These levels are described below:
|Active Speech Level (ASL) (dBov)|Power level (RMS) during periods of speech|
|Mean Noise Level (MNL) (dBov)|Power level (RMS) during periods of silence|
|RMS Mean Level (dBov)|Power level (RMS) of the entire sample|
|DC Offset (PCM units)|DC offset of the input sample|
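The level measurements above can be sketched as follows. This is a simplified illustration, not the tool's method: it assumes 16-bit PCM samples (overload point 32768) and takes a ready-made per-sample speech/silence mask as input, whereas a standards-based active speech level measurement (such as ITU-T P.56) derives speech activity from the signal envelope. All function names are hypothetical.

```python
import math

FULL_SCALE = 32768.0  # assumed overload point for 16-bit PCM

def dc_offset(samples):
    """Mean sample value in PCM units."""
    return sum(samples) / len(samples)

def rms_dbov(samples):
    """RMS power level relative to the overload point, in dBov."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / FULL_SCALE ** 2)

def measure_levels(samples, speech_mask):
    """Compute the four levels from samples and a per-sample speech mask (True = speech)."""
    speech = [s for s, m in zip(samples, speech_mask) if m]
    silence = [s for s, m in zip(samples, speech_mask) if not m]
    return {
        "asl_dbov": rms_dbov(speech),    # level during speech
        "mnl_dbov": rms_dbov(silence),   # level during silence
        "rms_dbov": rms_dbov(samples),   # level of the entire sample
        "dc_offset": dc_offset(samples),
    }

# Hypothetical four-sample signal: two speech samples at half scale, two silent samples
levels = measure_levels([16384, -16384, 0, 0], [True, True, False, False])
print(levels)
```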
The following results are interpreted from the data above:
|Speech Level Gain (dB)|Speech level gain of the system under test. Calculated as (ASL of degraded signal) minus (ASL of reference signal).|
|Noise Level Gain (dB)|Gain calculated for noise in silent periods. Calculated as (MNL of degraded signal) minus (MNL of reference signal). May differ from the system gain if noise is added or suppressed.|
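Both gains are simple differences of levels, which a short sketch makes concrete. The function name `level_gains` and the example levels are hypothetical.

```python
def level_gains(ref_asl, deg_asl, ref_mnl, deg_mnl):
    """Return (speech_level_gain_db, noise_level_gain_db) as degraded minus reference."""
    speech_gain = deg_asl - ref_asl
    noise_gain = deg_mnl - ref_mnl
    return speech_gain, noise_gain

# Hypothetical levels in dBov
speech_gain, noise_gain = level_gains(
    ref_asl=-26.0, deg_asl=-29.0, ref_mnl=-70.0, deg_mnl=-62.0
)
print(speech_gain, noise_gain)
```

With these example values the speech is attenuated by 3 dB while the noise floor rises by 8 dB. A mismatch of that kind between the two gains is what would be expected when the system under test adds noise.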
A PESQ/PESQ LQ/PESQ LQO score is available on a per-utterance basis. Each sample is broken into distinct utterances,
and GL provides an ITU score for each utterance.
For example, front-end clipping that affects only the first utterance could cause the overall scores to be lower than
expected. The per-utterance PESQ scores will indicate this cause.
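To show how per-utterance scores can localize a problem, the sketch below flags utterances whose score falls well below the median of the sample. The function name, the margin of 0.5, and the example scores are assumptions made for illustration; the tool itself may present the scores differently.

```python
import statistics

def low_scoring_utterances(scores, margin=0.5):
    """Return 1-based indices of utterances scoring more than `margin` below the median."""
    median = statistics.median(scores)
    return [i for i, s in enumerate(scores, start=1) if s < median - margin]

# Hypothetical per-utterance scores; the first utterance is degraded
print(low_scoring_utterances([2.1, 3.9, 4.0, 3.8]))
```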
The delay per utterance results are acquired by comparing the beginning of each utterance in the reference file to the beginning
of each utterance in the degraded file. This comparison takes place for each utterance in the reference and degraded files.
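The comparison described above amounts to subtracting utterance start times pairwise. A minimal sketch, assuming the start times (in ms) have already been detected in both files and are listed in the same utterance order; the function name and the values are hypothetical.

```python
def delay_per_utterance(ref_starts_ms, deg_starts_ms):
    """Return the delay of each degraded utterance relative to its reference utterance."""
    if len(ref_starts_ms) != len(deg_starts_ms):
        raise ValueError("reference and degraded files must have the same number of utterances")
    return [deg - ref for ref, deg in zip(ref_starts_ms, deg_starts_ms)]

# Hypothetical start times for three utterances
print(delay_per_utterance([0, 1500, 3200], [120, 1625, 3318]))
```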
* Specifications are subject to change without notice.