Audio Commons: an Ecosystem for Creative Reuse of Audio Content

European Commission Horizon 2020 grant ref. 688382

Further information about the project and its partners can be found at www.audiocommons.org

The IoSR researchers contributing to the project are:

Research Fellow: Andy Pearce
Principal Supervisor: Dr Tim Brookes
Co-Supervisor: Dr Russell Mason

Start date: 2016
End date: 2019

Project Outline

The Audio Commons Initiative aimed to promote the use of open audio content and to develop technologies to support an ecosystem of content repositories, production tools and users (the Audio Commons Ecosystem). These technologies were intended to enable the reuse of this audio material, facilitating its integration into the production workflows of the creative industries.

The IoSR's role was to determine the timbral attributes most useful for automatic characterisation of sound library elements, and to research, develop and evaluate automatic methods for semantically annotating a sound's timbral characteristics, such that libraries may be explored using these characteristics as search terms.

Timbral Hierarchy & Sunburst Plots

The first part of this work identified the timbral attributes with the potential to add value to online audio as automatically generated tags. A dictionary of timbral attributes and terms was compiled from the relevant academic literature, and this dictionary was then structured into a hierarchy. The frequency with which each dictionary term was used in online searches for audio content was then established, to give an indication of the potential value of each term.
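As an illustration, such a hierarchy might be represented as a nested mapping from attributes to their constituent terms, with each term carrying its observed search frequency. The sketch below is a minimal assumption about one possible representation; the attribute names echo those studied in the project, but the terms and counts are invented placeholders, not project data.

    # A minimal sketch of one way such a hierarchy might be stored: each
    # top-level attribute maps to the dictionary terms grouped under it,
    # and each term carries its search frequency. The counts below are
    # invented placeholders, not project data.
    timbral_hierarchy = {
        "brightness": {"bright": 1200, "brilliant": 150, "dull": 300},
        "warmth": {"warm": 900, "mellow": 210},
        "hardness": {"hard": 400, "soft": 380},
    }

    def attribute_value(attribute):
        """Estimate an attribute's potential value as the total search
        frequency of the terms that comprise it."""
        return sum(timbral_hierarchy[attribute].values())

    # Rank attributes by aggregate term frequency.
    for name in sorted(timbral_hierarchy, key=attribute_value, reverse=True):
        print(name, attribute_value(name))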

Interactive sunburst plots showing the structure of the timbral hierarchy and the terms which comprise each attribute can be found at sunburst.php.
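For readers who wish to reproduce a similar visualisation, a sunburst of a term hierarchy can be drawn with a plotting library such as Plotly. The snippet below is a sketch only; the terms shown are placeholders rather than the project's full dictionary.

    # A minimal sketch of a sunburst plot of a small term hierarchy
    # using Plotly Express; the terms are placeholders, not the full
    # dictionary compiled by the project.
    import plotly.express as px

    names   = ["timbre", "brightness", "warmth",
               "bright", "dull", "warm", "mellow"]
    parents = ["", "timbre", "timbre",
               "brightness", "brightness", "warmth", "warmth"]

    fig = px.sunburst(names=names, parents=parents)
    fig.show()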

Modelling

Some of the most important timbral attributes were then investigated acoustically and perceptually, using a series of listening tests and acoustic analyses, and algorithms were developed to extract from any sound the acoustic features most relevant to its timbral characteristics. Combinations of these algorithms were then developed into computer models that can predict human perception of the attributes hardness, depth, brightness, warmth, roughness, sharpness, boominess, and reverberation.
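The general approach, regressing extracted acoustic features onto listening-test ratings, might be sketched as follows. The feature columns, ratings, and the choice of a simple linear regression are illustrative assumptions only, not the project's published models.

    # A minimal sketch of the general modelling approach: regress
    # acoustic features extracted from sounds onto mean listener
    # ratings of an attribute. All values are invented placeholders;
    # the actual models combine different features per attribute.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Rows: sounds; columns: hypothetical features (e.g. spectral
    # centroid, zero-crossing rate, high-frequency energy ratio).
    features = np.array([[1800.0, 0.12, 0.30],
                         [3200.0, 0.25, 0.55],
                         [ 900.0, 0.05, 0.10],
                         [2500.0, 0.18, 0.45]])

    # Mean listening-test ratings of, say, brightness for each sound.
    ratings = np.array([40.0, 75.0, 15.0, 60.0])

    model = LinearRegression().fit(features, ratings)
    print("Predicted brightness:", model.predict([[2000.0, 0.15, 0.35]]))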

The models are still at the prototype stage, and improvements are made regularly to better fit the subjective ratings of each attribute. They are implemented in Python and are available from the project's GitHub page (which hosts the most up-to-date versions) or installable from PyPI (updated when significant changes are made).
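Once installed, a typical call might look like the following. This assumes the package is named timbral_models and exposes one function per attribute (e.g. timbral_brightness) taking an audio file path; the exact interface may have changed between versions, so check the project's GitHub page.

    # A sketch of calling one of the prototype models; the package and
    # function names are assumed to follow the attribute names.
    import timbral_models

    brightness = timbral_models.timbral_brightness("example.wav")
    print("Predicted brightness:", brightness)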

Currently, each model is coded as an independent function, but there are plans to develop a timbral extractor function that extracts all timbral attributes efficiently in a single call, and to integrate the models into the Audio Commons Extractor.
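In principle, such an extractor could be a thin wrapper that runs each per-attribute function once per file and collects the results. The sketch below is hypothetical, not the project's implementation, and assumes the per-attribute interface shown above.

    # A hypothetical sketch of the planned timbral extractor: run every
    # per-attribute model on one file and collect the results in a dict.
    # "booming" and "reverb" stand in for the boominess and reverberation
    # attributes above; the exact function names are assumptions and may
    # differ between versions of the package.
    import timbral_models

    ATTRIBUTES = ["hardness", "depth", "brightness", "warmth",
                  "roughness", "sharpness", "booming", "reverb"]

    def timbral_extractor(fname):
        """Return {attribute: predicted value} for a single audio file."""
        return {attr: getattr(timbral_models, f"timbral_{attr}")(fname)
                for attr in ATTRIBUTES}

    print(timbral_extractor("example.wav"))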

Timbral Explorer

The Freesound Explorer, developed by the MTG at UPF, has now been modified to make use of data generated by our timbral models, so that it can search and retrieve sounds from freesound.org and graphically distribute them according to their timbres. The interface is presented below. Any desired source type can be specified in the text field (top right), and the x- and y-axis timbral dimensions can be selected via the drop-down menus. (If you are unable to hear the selected sounds, please try a different browser; problems have been experienced with some versions of Apple's Safari, for example.)
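To give a flavour of how such an application might talk to freesound.org, the sketch below queries the text-search endpoint of the Freesound APIv2. The endpoint itself is real, but the timbral filter field (ac_brightness) and its range are assumptions here; consult the Freesound API documentation for the current field names.

    # A sketch of a Freesound APIv2 text search with a timbral filter;
    # the filter field name is an assumption, not confirmed API usage.
    import requests

    API_KEY = "YOUR_FREESOUND_API_KEY"  # obtainable from freesound.org

    resp = requests.get(
        "https://freesound.org/apiv2/search/text/",
        params={
            "query": "dog",                        # desired source type
            "filter": "ac_brightness:[70 TO 100]", # assumed timbral filter
            "fields": "id,name,previews",
            "token": API_KEY,
        },
    )
    for sound in resp.json().get("results", []):
        print(sound["id"], sound["name"])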

The Timbral Explorer is available in full-screen at andyp103.github.io/FreesoundTimbralSearch.

Publications

  • A. Pearce, T. Brookes, R. Mason (2021). Modelling the Microphone-Related Timbral Brightness of Recorded Signals. Applied Sciences (Special Issue on Applications of Machine Learning in Audio Classification and Acoustic Scene Characterization), vol. 11, iss. 14, article 6461.

Data Archive

The data generated by this project (including code and results) are available in these repositories: