Towards the automatic assessment of spatial quality in the reproduced sound environment

Research Student: Dr Rob Conetta
Principal Supervisor: Dr Francis Rumsey
Co-supervisor: Dr Slawek Zielinski
Thesis Supervisor: Dr Tim Brookes
Industrial partners: Prof Søren Bech, Bang & Olufsen, Denmark. David Meares, BBC Research and Development

Start date: 2006
End date: 2011

Part of QESTRAL (EPSRC Project Reference: EP/D041244/1)

Project Outline

This project formed part of QESTRAL (Quality Evaluation of Spatial Transmission and Reproduction using an Artificial Listener), creating and developing a method for the prediction of perceived spatial quality. The QESTRAL model is an objective evaluation model capable of accurately predicting changes to perceived spatial quality. It uses probe signals and a set of objective metrics to measure changes to low-level spatial attributes. A polynomial weighting function derived from regression analysis is used to predict data from listening tests, which employed spatial audio processes (SAPs) proven to stress those low-level attributes.

A listening test method was developed for collecting listener judgements of impairments to spatial quality. This involved the creation of a novel test interface to reduce the biases inherent in other similar audio quality assessment tests. Pilot studies were undertaken which established the suitability of the method.

Two large-scale listening tests were conducted using 31 Tonmeister students from the Institute of Sound Recording (IoSR), University of Surrey. These tests evaluated 48 different SAPs, typically encountered in consumer sound reproduction equipment, when applied to six types of programme material. The tests were conducted at two listening positions to determine how listening position affected perceived spatial quality.

Analysis of the data collected from these listening tests showed that the SAPs created a diverse range of judgements that spanned the range of the spatial quality test scale and that listening position, programme material type and listener each had a statistically significant influence upon perceived spatial quality. These factors were incorporated into a database of 308 responses used to calibrate the model.

The model was calibrated by partial least-squares regression, with target specifications similar to those of audio quality models created by other researchers. This resulted in five objective metrics being selected for use in the model. A post-correction step using an exponential equation was applied to reduce non-linearity in the predicted results, thought to be caused by the inability of some metrics to scrutinise the highest quality SAPs. The resulting model had a correlation (r) of 0.89 and an error (RMSE) of 11.06%, and performed similarly to models developed by other researchers. Statistical analysis also indicated that the model would generalise to a larger population of listeners.
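
The two performance figures quoted above, correlation (r) and RMSE, can be illustrated with a minimal sketch. It is not the project's code: the function names, the assumption of a 0 to 100 test scale (so that RMSE can be expressed as a percentage of the scale range), and the sample values are all hypothetical and chosen only to show how such figures are typically computed from predicted and listener scores.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between model predictions and listener scores."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


def rmse_percent(predicted, observed, scale_range=100.0):
    """Root-mean-square error as a percentage of the scale range.

    scale_range=100.0 is an assumption about the test scale, not a
    value taken from the project description.
    """
    n = len(predicted)
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    return 100.0 * math.sqrt(mse) / scale_range


# Made-up illustrative values, not data from the listening tests.
observed = [10.0, 35.0, 50.0, 80.0, 95.0]
predicted = [15.0, 30.0, 55.0, 75.0, 90.0]

r = pearson_r(predicted, observed)
error = rmse_percent(predicted, observed)
print(f"r = {r:.2f}, RMSE = {error:.2f}%")
```

In a calibration setting these two numbers would normally be computed on scores held out from the regression, so that they indicate how well the model generalises and not only how well it fits.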

Publications

Journal Papers
Conference & Convention Papers
Conference abstracts
Poster
Thesis