Apparatus and methods are disclosed for performing object-based audio rendering on a plurality of audio objects which define a sound scene, each audio object comprising at least one audio signal and associated metadata. The apparatus comprises: a plurality of renderers each capable of rendering one or more of the audio objects to output rendered audio data; and object adapting means for adapting one or more of the plurality of audio objects for a current reproduction scenario, the object adapting means being configured to send the adapted one or more audio objects to one or more of the plurality of renderers.
© 2017, University of Surrey, All rights reserved
Object-Based Audio Rendering
Philip Jackson1, Filippo Fazi2, Frank Melchior3, Trevor Cox4, Adrian Hilton1,
Chris Pike3, Jon Francombe5, Andreas Franck2, Philip Coleman1, Dylan Menzies-Gow2,
James Woodcock4, Yan Tang4, Qingju Liu1, Rick Hughes4, Marcos Simon Galvez2,
Teo de Campos1, Hansung Kim1, Hanne Stenzel1
24 August 2017
1 Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, UK
2 Institute of Sound & Vibration Research (ISVR), University of Southampton, UK
3 Audio Research, BBC R&D, UK
4 Acoustics Research Centre (ARC), University of Salford, UK
5 Institute of Sound Recording (IoSR), University of Surrey, UK
Overview
This document provides a transcript of GB Patent Application No: GB1609316.3, which
was filed in the UK by the University of Surrey on 23 May 2016. It describes an
intelligent system for customising, personalising and perceptually monitoring the
rendering of an object-based audio stream for an arbitrary connected system of
loudspeakers to optimise the listening experience as the producer intended.
Acknowledgements
The development of the concepts and implementation were supported by the EPSRC
Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at
Home (EP/L000539/1). The authors would like to thank Chelsea Brain and Rob Yates
of the University of Surrey and Rob Cork of Venner Shipley LLP for their assistance in
the preparation of this document.
Object-Based Audio Rendering
Technical Field
The present invention relates to object-based audio rendering.
Background
Systems and methods for reproducing audio can broadly be categorised as either
channel-based or object-based. In channel-based audio, a soundtrack is created by
recording a separate audio track (channel) for each speaker. Common speaker
arrangements for channel-based surround sound systems are 5.1 and 7.1, which utilise
five and seven full-range channels respectively, plus one low-frequency effects channel. A
major drawback of channel-based audio is that each soundtrack must be created for a
specific speaker configuration, hence the development of industry-standard
configurations such as 2.0 (stereo), 5.1 and 7.1.
Object-based audio addresses this drawback by representing a sound scene as multiple
separate audio objects, each of which comprises one or more audio signals and
associated metadata defining, among other things, the location and trajectory of
that object in the scene. Object-based audio rendering
involves the rendering of audio objects into loudspeaker signals to reproduce the
authored sound scene. As well as specifying the location and movement of an object,
the metadata can also define the type of object and the class of renderer that should be
used to render the object. For example, an object may be identified as being a diffuse
object or a point source object. Typically, object-based renderers use the positional
metadata with a rendering algorithm that is specific to the particular object type to pan
each object over a wide variety of conformal loudspeaker arrangements, based on
knowledge of the loudspeaker directions from the predefined ‘sweet spot’ listening
position.
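As a concrete illustration of the object representation and rendering described above, the following is a minimal, hypothetical sketch (the class and function names are assumptions, not part of the patent): an audio object carrying a signal plus positional and type metadata, and a constant-power pan of a point object onto a standard two-speaker layout using its azimuth metadata.

```python
from dataclasses import dataclass
import math

@dataclass
class AudioObject:
    # One or more audio signals; here a single mono buffer of samples.
    signal: list
    # Positional metadata: azimuth in degrees relative to the listener,
    # 0 straight ahead, positive to the right.
    azimuth_deg: float
    # Type metadata hinting which class of renderer should handle the object.
    object_type: str = "point"   # e.g. "point" or "diffuse"

def pan_stereo(obj: AudioObject):
    """Constant-power pan of a point object onto a two-speaker layout.

    Maps azimuth in [-30, +30] degrees (typical stereo speaker angles)
    onto a pan angle in [0, pi/2] for the sine/cosine gain law.
    """
    az = max(-30.0, min(30.0, obj.azimuth_deg))
    theta = (az + 30.0) / 60.0 * (math.pi / 2.0)  # 0 = hard left, pi/2 = hard right
    gain_l, gain_r = math.cos(theta), math.sin(theta)
    left = [gain_l * s for s in obj.signal]
    right = [gain_r * s for s in obj.signal]
    return left, right

# A centred object receives equal gains on both channels.
centre = AudioObject(signal=[1.0, 0.5], azimuth_deg=0.0)
left, right = pan_stereo(centre)
```

The constant-power (sine/cosine) law keeps the total radiated power roughly constant as the object moves, which is why it is preferred over simple linear crossfading for positional panning.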
Object-based renderers provide greater flexibility than channel-based audio systems,
insofar as they can cope with different numbers of speakers. In practice, however,
existing object-based methods can only cope with a very limited degree of irregularity
in the loudspeaker layout, and often require a static sweet spot together with gain and
phase compensation. Existing object-based methods can also suffer from a small sweet
spot. There is therefore a need in the
art for improved object-based rendering apparatuses and methods.
The invention is made in this context.
Summary of the Invention
According to a first aspect of the present invention, there is provided apparatus for
performing object-based audio rendering on a plurality of audio objects which define a
sound scene, each audio object comprising at least one audio signal and associated
metadata, the apparatus comprising: a plurality of renderers each capable of rendering
one or more of the audio objects to output rendered audio data; and object adapting
means for adapting one or more of the plurality of audio objects for a current
reproduction scenario, the object adapting means being configured to send the adapted
one or more audio objects to one or more of the plurality of renderers.
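The first aspect can be sketched in code as a simple dispatch pipeline. This is a toy illustration under stated assumptions (the dictionary keys, the gain-trim adaptation, and all names are hypothetical, not drawn from the patent): an object adapter adapts each audio object for the current reproduction scenario and forwards it to a renderer capable of rendering that object's type.

```python
def adapt_for_scenario(obj, scenario):
    """Adapt an audio object for the current reproduction scenario.

    As a toy example, scale the object's gain by a scenario-wide trim; a
    real adapter might also remap positions to the available loudspeakers.
    """
    adapted = dict(obj)
    adapted["gain"] = obj.get("gain", 1.0) * scenario.get("gain_trim", 1.0)
    return adapted

class Renderer:
    """A renderer capable of rendering one or more object types."""
    def __init__(self, handled_types):
        self.handled_types = set(handled_types)
        self.received = []   # objects sent to this renderer

    def can_render(self, obj):
        return obj["type"] in self.handled_types

    def render(self, obj):
        self.received.append(obj)

def dispatch(objects, renderers, scenario):
    """Object adapting means: adapt each object, then send it to the
    first renderer capable of rendering its type."""
    for obj in objects:
        adapted = adapt_for_scenario(obj, scenario)
        target = next(r for r in renderers if r.can_render(adapted))
        target.render(adapted)

point_renderer = Renderer({"point"})
diffuse_renderer = Renderer({"diffuse"})
scene = [{"type": "point", "gain": 1.0}, {"type": "diffuse", "gain": 0.5}]
dispatch(scene, [point_renderer, diffuse_renderer], {"gain_trim": 0.8})
```

Routing by object type matches the claim's structure: the adapter owns scenario-specific adaptation, while each renderer only needs to handle the object classes it declares.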
In some embodiments according to the first aspect, the object adapting means
comprises: a scene adapter configured to adapt the sound scene for the current
reproduction scenario by adapting an audio signal and/or metadata of one or more of
the audio objects; and an object refiner configured to receive t
…(Full text truncated)…