Apparatus and methods are disclosed for performing object-based audio rendering on a plurality of audio objects which define a sound scene, each audio object comprising at least one audio signal and associated metadata. The apparatus comprises: a plurality of renderers each capable of rendering one or more of the audio objects to output rendered audio data; and object adapting means for adapting one or more of the plurality of audio objects for a current reproduction scenario, the object adapting means being configured to send the adapted one or more audio objects to one or more of the plurality of renderers.
© 2017, University of Surrey, All rights reserved
Object-Based Audio Rendering
Philip Jackson1, Filippo Fazi2, Frank Melchior3, Trevor Cox4, Adrian Hilton1,
Chris Pike3, Jon Francombe5, Andreas Franck2, Philip Coleman1, Dylan Menzies-Gow2,
James Woodcock4, Yan Tang4, Qingju Liu1, Rick Hughes4, Marcos Simon Galvez2,
Teo de Campos1, Hansung Kim1, Hanne Stenzel1
24 August 2017
1 Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, UK
2 Institute of Sound & Vibration Research (ISVR), University of Southampton, UK
3 Audio Research, BBC R&D, UK
4 Acoustics Research Centre (ARC), University of Salford, UK
5 Institute of Sound Recording (IoSR), University of Surrey, UK
Overview
This document provides a transcript of GB Patent Application No: GB1609316.3, which
was filed in the UK by the University of Surrey on 23 May 2016. It describes an
intelligent system for customising, personalising and perceptually monitoring the
rendering of an object-based audio stream for an arbitrary connected system of
loudspeakers to optimise the listening experience as the producer intended.
Acknowledgements
The development of the concepts and implementation were supported by the EPSRC
Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at
Home (EP/L000539/1). The authors would like to thank Chelsea Brain and Rob Yates
of the University of Surrey and Rob Cork of Venner Shipley LLP for their assistance in
the preparation of this document.
Object-Based Audio Rendering
Technical Field
The present invention relates to object-based audio rendering.
Background
Systems and methods for reproducing audio can broadly be categorised as either
channel-based or object-based. In channel-based audio, a soundtrack is created by
recording a separate audio track (channel) for each speaker. Common speaker
arrangements for channel-based surround sound systems are 5.1 and 7.1, which utilise
five and seven full-range channels respectively, plus one low-frequency effects channel. A
major drawback of channel-based audio is that each soundtrack must be created for a
specific speaker configuration, hence the development of industry-standard
configurations such as 2.0 (stereo), 5.1 and 7.1.
Object-based audio addresses this drawback by representing a sound scene as multiple
separate audio objects, each of which comprises one or more audio signals and
associated metadata defining, among other things, the location and trajectory of
that object in the scene. Object-based audio rendering
involves the rendering of audio objects into loudspeaker signals to reproduce the
authored sound scene. As well as specifying the location and movement of an object,
the metadata can also define the type of object and the class of renderer that should be
used to render the object. For example, an object may be identified as being a diffuse
object or a point source object. Typically, object-based renderers use the positional
metadata with a rendering algorithm that is specific to the particular object type to pan
each object over a wide variety of conformal loudspeaker arrangements, based on
knowledge of the loudspeaker directions from the predefined ‘sweet spot’ listening
position.
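As a concrete illustration of the object representation and rendering described above, the following is a minimal, hypothetical sketch (the class and function names are assumptions, not part of the patent): an audio object carrying a signal plus positional and type metadata, and a constant-power pan of a point object onto a standard two-speaker layout using its azimuth metadata.

```python
from dataclasses import dataclass
import math

@dataclass
class AudioObject:
    # One or more audio signals; here a single mono buffer of samples.
    signal: list
    # Positional metadata: azimuth in degrees relative to the listener,
    # 0 straight ahead, positive to the right.
    azimuth_deg: float
    # Type metadata hinting which class of renderer should handle the object.
    object_type: str = "point"   # e.g. "point" or "diffuse"

def pan_stereo(obj: AudioObject):
    """Constant-power pan of a point object onto a two-speaker layout.

    Maps azimuth in [-30, +30] degrees (typical stereo speaker angles)
    onto a pan angle in [0, pi/2] for the sine/cosine gain law.
    """
    az = max(-30.0, min(30.0, obj.azimuth_deg))
    theta = (az + 30.0) / 60.0 * (math.pi / 2.0)  # 0 = hard left, pi/2 = hard right
    gain_l, gain_r = math.cos(theta), math.sin(theta)
    left = [gain_l * s for s in obj.signal]
    right = [gain_r * s for s in obj.signal]
    return left, right

# A centred object receives equal gains on both channels.
centre = AudioObject(signal=[1.0, 0.5], azimuth_deg=0.0)
left, right = pan_stereo(centre)
```

The constant-power (sine/cosine) law keeps the total radiated power roughly constant as the object moves, which is why it is preferred over simple linear crossfading for positional panning.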
Object-based renderers provide greater flexibility than channel-based audio systems,
insofar as they can cope with different numbers of speakers. In practice, however,
existing object-based methods can only cope with a very limited degree of irregularity
in the loudspeaker layout, and often require a static sweet spot together with gain and
phase compensation. Existing object-based methods can also suffer from a small sweet
spot. There is therefore a need in the
art for improved object-based rendering apparatuses and methods.
The invention is made in this context.
Summary of the Invention
According to a first aspect of the present invention, there is provided apparatus for
performing object-based audio rendering on a plurality of audio objects which define a
sound scene, each audio object comprising at least one audio signal and associated
metadata, the apparatus comprising: a plurality of renderers each capable of rendering
one or more of the audio objects to output rendered audio data; and object adapting
means for adapting one or more of the plurality of audio objects for a current
reproduction scenario, the object adapting means being configured to send the adapted
one or more audio objects to one or more of the plurality of renderers.
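The first aspect can be sketched in code as a simple dispatch pipeline. This is a toy illustration under stated assumptions (the dictionary keys, the gain-trim adaptation, and all names are hypothetical, not drawn from the patent): an object adapter adapts each audio object for the current reproduction scenario and forwards it to a renderer capable of rendering that object's type.

```python
def adapt_for_scenario(obj, scenario):
    """Adapt an audio object for the current reproduction scenario.

    As a toy example, scale the object's gain by a scenario-wide trim; a
    real adapter might also remap positions to the available loudspeakers.
    """
    adapted = dict(obj)
    adapted["gain"] = obj.get("gain", 1.0) * scenario.get("gain_trim", 1.0)
    return adapted

class Renderer:
    """A renderer capable of rendering one or more object types."""
    def __init__(self, handled_types):
        self.handled_types = set(handled_types)
        self.received = []   # objects sent to this renderer

    def can_render(self, obj):
        return obj["type"] in self.handled_types

    def render(self, obj):
        self.received.append(obj)

def dispatch(objects, renderers, scenario):
    """Object adapting means: adapt each object, then send it to the
    first renderer capable of rendering its type."""
    for obj in objects:
        adapted = adapt_for_scenario(obj, scenario)
        target = next(r for r in renderers if r.can_render(adapted))
        target.render(adapted)

point_renderer = Renderer({"point"})
diffuse_renderer = Renderer({"diffuse"})
scene = [{"type": "point", "gain": 1.0}, {"type": "diffuse", "gain": 0.5}]
dispatch(scene, [point_renderer, diffuse_renderer], {"gain_trim": 0.8})
```

Routing by object type matches the claim's structure: the adapter owns scenario-specific adaptation, while each renderer only needs to handle the object classes it declares.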
In some embodiments according to the first aspect, the object adapting means
comprises: a scene adapter configured to adapt the sound scene for the current
reproduction scenario by adapting an audio signal and/or metadata of one or more of
the audio objects; and an object refiner configured to receive t
…(Full text truncated)…