Audio/Video Frame Alignment


In current digital broadcast systems, encoded audio and video use different frame rates. Each rate makes sense in isolation, but combining two different rates in a final delivery stream or package complicates further manipulation of the program in the transport domain. Applications such as editing, ad insertion, and international turnarounds become challenging to implement because the switching points at the ends of video frames do not align with the ends of audio frames. If not handled carefully, this can result in sync errors between video and audio or in audible audio errors. Current solutions involve decoding and re-encoding the audio, which introduces potential sync errors, quality loss, and, in the case of object audio, misalignment of audio and time-critical positional metadata.

Dolby AC-4 takes a new approach. The Dolby AC-4 encoder features an optional video reference input to align the audio and video frames. The encoded audio frame rate can therefore be set to match the video frame rate, so the boundaries of the audio frames can be precisely aligned with the boundaries of the video frames. The Dolby AC-4 system accommodates current broadcast standards, which specify video rates from 23.976 to 60 Hz, and also supports rates up to 120 Hz for new ultra high-definition specifications.
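To make the alignment problem concrete, the following minimal sketch compares video frame boundaries against audio frame boundaries. It assumes a 48 kHz sample rate and uses a fixed 1536-sample audio frame (the Dolby Digital frame length) purely as an example of a codec whose frame duration is independent of the video rate. The function names are illustrative and not part of any Dolby API.

```python
from fractions import Fraction

SAMPLE_RATE = 48_000  # Hz; assumed for this example


def samples_per_video_frame(video_rate: Fraction) -> Fraction:
    """Number of audio samples spanned by one video frame."""
    return Fraction(SAMPLE_RATE) / video_rate


def splice_offset(video_frame_index: int, video_rate: Fraction,
                  audio_frame_duration: Fraction) -> Fraction:
    """Seconds between a video frame boundary and the last audio
    frame boundary before it.

    Zero means a switch at this video boundary does not cut
    through an audio frame.
    """
    boundary_time = Fraction(video_frame_index) / video_rate
    return boundary_time % audio_frame_duration


RATE_25 = Fraction(25)               # 25 fps video
RATE_29_97 = Fraction(30_000, 1001)  # "29.97" fps video

# Audio frame duration that ignores the video rate (1536 samples).
fixed = Fraction(1536, SAMPLE_RATE)
# Audio frame duration chosen to match the video frame duration.
matched = 1 / RATE_25

fixed_offsets = [splice_offset(i, RATE_25, fixed) for i in range(1, 5)]
matched_offsets = [splice_offset(i, RATE_25, matched) for i in range(1, 5)]
```

In this example the fixed-length audio frame coincides with a video boundary only at every fourth video frame, whereas the matched frame duration gives a zero offset at every boundary. Note also that `samples_per_video_frame(RATE_29_97)` is not an integer; how a codec handles such fractional spans is outside the scope of this sketch.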


Advanced Loudness Management


To help services ensure loudness consistency and compliance with regulations, the Dolby AC-4 encoder incorporates integrated intelligent loudness management. The encoder assesses the loudness of incoming audio and can, if desired, update the loudness metadata (dialnorm) to the correct value or signal the multiband processing required to bring the program to the target loudness level. Rather than processing the audio in the encoder, this information is added to the bitstream in the dynamic range control (DRC) metadata so that processing can be applied downstream in the consumer device, as appropriate for the playback scenario. The process is therefore non-destructive: the original audio is carried in the bitstream and remains available for future applications.
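The non-destructive idea described above can be sketched as follows. This is a simplified illustration only: a plain RMS level stands in for a real loudness meter, and `LoudnessMetadata` and `analyze` are hypothetical names, not the Dolby AC-4 bitstream syntax or encoder API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LoudnessMetadata:
    """Hypothetical container for the value the encoder would signal."""
    measured_level_db: float  # dB relative to full scale (1.0)


def rms_level_db(samples):
    """Crude stand-in for a loudness meter: RMS level in dB re full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def analyze(samples):
    """Measure the audio and return metadata only.

    The samples are read but never modified, so the original audio
    can be carried unchanged alongside the metadata.
    """
    return LoudnessMetadata(measured_level_db=rms_level_db(samples))


audio = [0.1] * 100
before = list(audio)
metadata = analyze(audio)
```

The design point is that `analyze` returns a description of the audio rather than a processed copy, leaving any gain change to a later stage.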

To avoid the problems associated with cascaded leveling processes, Dolby AC-4 makes use of the extended metadata framework standardized in European Telecommunications Standards Institute (ETSI) TS 102 366 Annex H. This framework carries information about the loudness processing history of the content so that downstream devices can intelligently disable or adjust their processing accordingly, maximizing quality while maintaining consistency. Annex H metadata can be carried throughout the program chain, either with the baseband audio prior to final encoding or inserted into transmission bitstreams, including Dolby Digital Plus and Dolby AC-4. If the incoming audio presented to the Dolby AC-4 encoder has previously been produced or adjusted to a target loudness level by a trusted device, this can be signaled to the encoder using Annex H metadata. In this case, the integrated loudness leveling processing of the encoder is automatically disabled, so that the audio is delivered without further adjustment, maximizing quality and preserving the original creative decisions.
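The decision described above, skipping leveling when a trusted device has already done it, can be sketched as a small function. The `ProcessingHistory` type and its field name are hypothetical and do not reflect the actual Annex H syntax; treating absent history as "not yet leveled" is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessingHistory:
    """Hypothetical summary of upstream loudness processing."""
    leveled_by_trusted_device: bool


def should_apply_leveling(history: Optional[ProcessingHistory]) -> bool:
    """Return True if the encoder's own leveling should run.

    No history at all is treated as "not yet leveled" (an assumption
    of this sketch), so the encoder's leveling stays enabled.
    """
    if history is None:
        return True
    return not history.leveled_by_trusted_device
```

For example, `should_apply_leveling(ProcessingHistory(leveled_by_trusted_device=True))` returns `False`, which corresponds to passing the audio through without a second leveling stage.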


Improved Dynamic Range Control


The Dolby AC-4 decoder applies DRC to tailor the dynamic range and the typical output level to suit the listening scenario. Dolby AC-4 supports a number of DRC modes to adapt the content to different listening environments and playback scenarios. Each mode is associated with a type of playback device and has guidelines for decoder-defined playback reference levels.

DRC parameters for each output mode are generated by the Dolby AC-4 encoder or by an external third-party processor and transported in the Dolby AC-4 bitstream as DRC gain values (wideband or multiband). Alternatively, the desired DRC characteristic can be expressed in the bitstream as a parameterized compression curve.



Parameterized Curves


A parameterized compression curve can be created by the service provider or content creator to suit the content and their house style. Curves may be selected from a number of presets in the encoder or customized if desired. Compared with transmitted gain values, parameterized compression curves offer lower bit-rate overhead and higher audio quality for traditional channel-based audio content. The benefit is even larger for object-based and immersive audio content, where transmitting a DRC gain per channel or object, as other systems do, becomes costly.
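As an illustration of what "parameterized" means here, the sketch below describes a generic compressor characteristic with a handful of numbers and derives the gain from them on demand. The parameter names, default values, and curve shape are assumptions chosen for this example and are not the Dolby AC-4 curve definition or any of its presets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionCurve:
    """Generic piecewise-linear DRC curve (hypothetical parameters).

    Levels are in dB relative to the program's reference loudness.
    """
    null_band_low_db: float = -5.0   # no gain change inside the null band
    null_band_high_db: float = 5.0
    boost_ratio: float = 2.0         # compression ratio below the null band
    cut_ratio: float = 2.0           # compression ratio above the null band
    max_boost_db: float = 6.0
    max_cut_db: float = 12.0

    def gain_db(self, level_rel_db: float) -> float:
        """Gain (dB) to apply for a given short-term level."""
        if level_rel_db > self.null_band_high_db:
            excess = level_rel_db - self.null_band_high_db
            cut = excess * (1.0 - 1.0 / self.cut_ratio)
            return -min(cut, self.max_cut_db)
        if level_rel_db < self.null_band_low_db:
            deficit = self.null_band_low_db - level_rel_db
            boost = deficit * (1.0 - 1.0 / self.boost_ratio)
            return min(boost, self.max_boost_db)
        return 0.0


curve = CompressionCurve()
```

Six numbers describe the whole characteristic, and a receiver can evaluate `curve.gain_db(...)` for any channel or object. This is the intuition behind the overhead saving relative to sending a separate gain value for every frame and every channel or object.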



Third-Party Gains


The Dolby AC-4 decoder calculates gains based on the transmitted compression curve or gains and the target playback reference level of the device. The target playback reference level for each mode is not fixed but instead can be defined in the decoder. This enables flexibility to match the loudness of other content sources depending on the listening scenario.
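The role of a decoder-defined target can be sketched with simple decibel arithmetic. The function names and the example levels below are illustrative assumptions, not Dolby's published guidelines or decoder API; the sketch covers only the static level offset and omits the time-varying DRC gains.

```python
def leveling_gain_db(program_loudness_db: float,
                     target_reference_db: float) -> float:
    """Static gain that moves the signaled program loudness to the
    device's chosen playback reference level."""
    return target_reference_db - program_loudness_db


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db: float):
    """Return a new list of samples scaled by the given gain."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]


# Example values only: a program signaled at -24 dB played on a
# device whose decoder is configured for a -16 dB reference level.
gain = leveling_gain_db(program_loudness_db=-24.0, target_reference_db=-16.0)
louder = apply_gain([0.1, -0.1], gain)
```

Because the target is an input to the calculation rather than a constant, the same bitstream yields a different gain on a device configured with a different reference level.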


Please see the Dolby AC-4 White Paper for more information.