148th meeting of MPEG

The 148th meeting of MPEG took place in Kemer, Türkiye, from 4 to 8 November 2024.

MPEG advances AI-based Point Cloud Coding

At its 146th meeting in April 2024, MPEG issued a Call for Proposals (CfP) to explore innovative AI-based Point Cloud Coding technologies, with the goal of enhancing compression techniques for diverse point cloud data. The CfP addressed the full range of point cloud formats, from dense point clouds used in immersive applications to sparse point clouds generated by Light Detection and Ranging (LiDAR) sensors in autonomous driving. With content bit depths ranging from 10 to 18 bits, the CfP called for solutions that meet the precision requirements of these varied use cases.

At the 148th MPEG meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) reviewed six responses to the CfP. The leading proposal distinguished itself with a hybrid coding strategy that integrates end-to-end learning-based geometry coding with traditional attribute coding. This proposal demonstrated exceptional adaptability, efficiently encoding both dense point clouds for immersive experiences and sparse point clouds from LiDAR sensors. Thanks to its unified design, the system supports inter-prediction coding with a model shared with intra-coding and operates across a wide range of bitrates without retraining. Furthermore, the proposal offers flexible configurations for both lossy and lossless geometry coding.
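
To make the hybrid split concrete, the sketch below routes geometry through a stand-in for a learned codec and attributes through a simple predictive path. It is purely illustrative: every function here is a hypothetical placeholder, not the actual design of the CfP proposal.

```python
import zlib

import numpy as np


def encode_geometry_learned(points: np.ndarray) -> bytes:
    """Placeholder for an end-to-end learned geometry codec: here we merely
    voxelize to integer coordinates and byte-compress them."""
    voxels = np.unique(points.round().astype(np.int32), axis=0)
    return zlib.compress(voxels.tobytes())


def encode_attributes_traditional(colors: np.ndarray) -> bytes:
    """Placeholder for traditional attribute coding: delta-predict each
    color from its predecessor, then byte-compress the residuals."""
    c = colors.astype(np.int16)
    residuals = np.diff(c, axis=0, prepend=c[:1])
    return zlib.compress(residuals.tobytes())


rng = np.random.default_rng(0)
points = rng.uniform(0, 1023, size=(1000, 3))                  # 10-bit range
colors = rng.integers(0, 256, size=(1000, 3), dtype=np.uint8)
geo = encode_geometry_learned(points)
attr = encode_attributes_traditional(colors)
print(f"geometry: {len(geo)} bytes, attributes: {len(attr)} bytes")
```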

Performance assessments highlighted the leading proposal’s effectiveness, with significant bitrate reductions compared to traditional codecs: a 47% reduction for dense, dynamic sequences in immersive applications and a 35% reduction for sparse dynamic sequences in LiDAR data. For combined geometry and attribute coding, it achieved a 40% bitrate reduction across both dense and sparse dynamic sequences, while subjective evaluations confirmed its superior visual quality over baseline codecs. Coding the geometry of a dynamic point cloud sequence typically requires 0.2 Mbps to 10 Mbps.

Encouraged by these promising results, MPEG launched a new AI-based Point Cloud Coding standardization project under WG 7. The leading proposal has been chosen as the initial test model, with a working draft and a common test condition document expected shortly after the 148th meeting. WG 7 extends its appreciation to all contributors who participated in the CfP, recognizing the invaluable role their proposals played in shaping the project’s success.

MPEG remains dedicated to advancing AI-driven point cloud coding technologies in upcoming meetings, gathering further insights to improve compression efficiency and quality. Because it builds on AI-based coding methods, the new system is expected to integrate smoothly with existing AI ecosystems and to enable high-performance, scalable 3D data applications.

The standard is expected to be finalized and reach the status of Final Draft International Standard (FDIS) in 2026.

MPEG ratifies a New Part of MPEG DASH for Redundant Encoding and Packaging

At its 148th meeting, MPEG Systems (WG 3) completed work on a new part of MPEG DASH, ISO/IEC 23009-9, which addresses redundant encoding and packaging for segmented live media (REAP). The specification was promoted to the Final Draft International Standard (FDIS) stage, the final step in the standard development process.

The standard is designed for scenarios where redundant encoding and packaging are essential, such as 24/7 live media production and distribution in cloud-based workflows. It specifies formats for interchangeable live media ingest and stream announcements, as well as formats for generating interchangeable media presentation descriptions. Additionally, it provides failover support and mechanisms for reintegrating distributed components in the workflow, whether they involve file-based content, live inputs, or a combination of both.
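
As a simple illustration of the kind of redundancy such workflows rely on, MPEG DASH already allows an MPD to list multiple BaseURL elements so that a client can fail over between redundant origins. The Python sketch below generates a minimal live MPD pointing at two hypothetical origins; all URLs and timing values are made up, and REAP itself specifies considerably more (ingest formats, stream announcements, reintegration).

```python
# Illustrative only: every URL and timing value below is hypothetical.
MPD_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic"
     profiles="urn:mpeg:dash:profile:isoff-live:2011"
     minimumUpdatePeriod="PT2S" minBufferTime="PT2S">
  {base_urls}
  <Period id="1" start="PT0S">
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <SegmentTemplate media="video_$Number$.m4s"
                       initialization="video_init.mp4"
                       duration="2" startNumber="1"/>
      <Representation id="v0" codecs="avc1.64001f"
                      bandwidth="3000000" width="1280" height="720"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def build_mpd(origins):
    # Each redundant encoder/packager chain publishes identical segments;
    # listing every origin lets a player switch when one chain fails.
    base_urls = "\n  ".join(f"<BaseURL>{o}</BaseURL>" for o in origins)
    return MPD_TEMPLATE.format(base_urls=base_urls)


print(build_mpd(["https://origin-a.example.com/live/",
                 "https://origin-b.example.com/live/"]))
```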

MPEG ratifies Reference Software and Conformance of ISOBMFF

At its 148th meeting, MPEG Systems (WG 3) advanced the 2nd edition of the ISO/IEC 14496-32 file format reference software and conformance standard to the Final Draft International Standard (FDIS) stage, the final step in the standard development process.

ISO/IEC 14496-32 includes an extensive collection of conformance bitstreams and reference software assets for the ISO/IEC 14496-12 ISO Base Media File Format (ISOBMFF) and related standards, such as ISO/IEC 14496-15 for the carriage of network abstraction layer (NAL) unit-structured video in the ISO base media file format, and ISO/IEC 23008-12 Image File Format, among others.
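
For readers unfamiliar with the format, ISOBMFF structures a file as a sequence of boxes, each headed by a 32-bit size and a four-character type. The minimal Python sketch below walks the top-level boxes of a buffer; it is not the reference software, and it deliberately does not descend into container boxes.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (type, payload) for each top-level ISOBMFF box."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:    # 64-bit 'largesize' follows the type field
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = len(data) - offset
        if size < header:
            break        # malformed box; stop rather than loop forever
        yield box_type.decode("ascii", "replace"), data[offset + header:offset + size]
        offset += size


# A tiny hand-built 'ftyp' box (major brand, minor version, one compatible
# brand) followed by an empty 'free' box.
sample = struct.pack(">I4s4sI4s", 20, b"ftyp", b"isom", 0, b"mp42")
sample += struct.pack(">I4s", 8, b"free")
for box, payload in iter_boxes(sample):
    print(box, len(payload))   # -> ftyp 12, free 0
```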

This standard provides both reference implementations to support product innovation and essential bitstreams for conformance testing, ensuring robustness and reliability in real-world applications. The standard will be made freely available for download on the official ISO website (https://www.iso.org), promoting widespread access for industry professionals, researchers, and enthusiasts. This commitment to openness and accessibility aligns with MPEG Systems’ mission to contribute to the broader technological community and promote collaboration.

MPEG reaches the First Milestone of a New Structural CMAF Brand Profile

At its 148th meeting, MPEG Systems (WG 3) advanced Amendment 2 of ISO/IEC 23000-19, which introduces a new structural CMAF brand profile, to Committee Draft Amendment (CDAM) status. This marks the initial phase of the standard development process for an amendment aimed at supporting emerging applications.

The amendment introduces several new media profiles for the Common Media Application Format (CMAF), designed to support Multi-View High Efficiency Video Coding (MV-HEVC), a stereoscopic video format intended for head-mounted displays. Additionally, the amendment defines a new CMAF brand that lifts the restriction of a single track per CMAF fragment. This change enables the multiplexing of closely associated metadata tracks alongside the media data, as sketched below.
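
To illustrate what the lifted restriction means at the file-format level, the short sketch below counts the 'traf' (track fragment) boxes inside a 'moof' box payload; under the new brand, a fragment’s movie fragment box may carry more than one, e.g. a media track plus an associated metadata track. The byte layout is hand-built for the demo.

```python
import struct


def count_traf(moof_payload: bytes) -> int:
    """Count 'traf' child boxes inside a 'moof' payload."""
    n, off = 0, 0
    while off + 8 <= len(moof_payload):
        size, box_type = struct.unpack_from(">I4s", moof_payload, off)
        if size < 8:
            break
        n += box_type == b"traf"
        off += size
    return n


# Hand-built moof payload: an 'mfhd' box plus two (empty) 'traf' boxes,
# standing in for a media track fragment and a metadata track fragment.
mfhd = struct.pack(">I4sII", 16, b"mfhd", 0, 1)
traf = struct.pack(">I4s", 8, b"traf")
print(count_traf(mfhd + traf + traf))  # -> 2
```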

The standard is expected to be finalized and reach the status of Final Draft Amendment (FDAM) by the end of 2025.

MPEG reaches the First Milestone of a new Part of MPEG-AI

At its 148th meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5; JVET) advanced Part 3 of MPEG-AI, ISO/IEC 23888-3 – Optimization of encoders and receiving systems for machine analysis of coded video content – to Committee Draft Technical Report (CDTR) status, marking the initial stage of the standard development process.

This new technical report is based on software experiments conducted by JVET, focusing on optimizing non-normative elements such as preprocessing, encoder settings, and postprocessing. The research explored scenarios where video signals, decoded from bitstreams compliant with the latest video compression standard, ISO/IEC 23090-3 – Versatile Video Coding (VVC), are intended for input into machine vision systems rather than for human viewing. Compared to the JVET VVC reference software encoder, which was originally optimized for human consumption, significant bit rate reductions were achieved when machine vision task precision was used as the performance criterion.
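
Such comparisons are commonly reported as a Bjøntegaard delta rate (BD-rate), with the machine task score, e.g. detection precision, taking the place of the usual PSNR axis. The sketch below shows that computation on made-up rate/precision points; it is a generic illustration, not JVET’s evaluation software.

```python
import numpy as np


def bd_rate(rate_anchor, score_anchor, rate_test, score_test):
    """Bjontegaard delta rate: fit log-rate as a cubic polynomial of the
    quality score, integrate both fits over the overlapping score range,
    and report the average rate difference in percent."""
    fit_a = np.polyfit(score_anchor, np.log(rate_anchor), 3)
    fit_t = np.polyfit(score_test, np.log(rate_test), 3)
    lo = max(min(score_anchor), min(score_test))
    hi = min(max(score_anchor), max(score_test))
    int_a = np.polyval(np.polyint(fit_a), hi) - np.polyval(np.polyint(fit_a), lo)
    int_t = np.polyval(np.polyint(fit_t), hi) - np.polyval(np.polyint(fit_t), lo)
    return (np.exp((int_t - int_a) / (hi - lo)) - 1.0) * 100.0


# Hypothetical numbers: rates in kbps, scores are task precision (e.g. mAP).
anchor = ([1000, 2000, 4000, 8000], [0.40, 0.48, 0.54, 0.58])
tuned = ([600, 1200, 2400, 4800], [0.40, 0.48, 0.54, 0.58])
print(f"BD-rate: {bd_rate(*anchor, *tuned):.1f}%")  # negative = bit savings
```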

The report will include an annex with example software implementations of these non-normative algorithmic elements, applicable to VVC or other video compression standards. Additionally, it will explore the potential use of existing supplemental enhancement information messages from ISO/IEC 23002-7 – Versatile supplemental enhancement information messages for coded video bitstreams – for embedding metadata useful in these contexts.

The technical report is expected to be finalized, achieving the status of Technical Report (TR), by the end of 2025.

MPEG reaches the First Milestone of the Second Edition of Conformance and Reference Software for MPEG Immersive Video

At the 148th MPEG meeting, MPEG Video Coding (WG 4) reached the Committee Draft (CD) stage of ISO/IEC 23090-23 Conformance and reference software for MPEG immersive video (MIV) 2nd edition. The document specifies how to conduct conformance tests and provides reference encoder and decoder software for ISO/IEC 23090-12 MPEG immersive video 2nd edition. This draft includes verified and validated conformance bitstreams and encoding and decoding reference software based on version 22 of the Test Model for MPEG Immersive Video (TMIV). The test model, objective metrics, and some other tools are publicly available at https://gitlab.com/mpeg-i-visual.

MIV was developed to support the compression of immersive video content, in which multiple real or virtual cameras capture a real or virtual 3D scene. The standard enables the storage and distribution of immersive video content over existing and future networks for playback with 6 degrees of freedom (6DoF) of view position and orientation. MIV is a flexible standard for multi-view video plus depth (MVD) and multi-plane image (MPI) video that leverages strong hardware support for commonly used video formats to compress volumetric video.
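
The core operation behind MVD rendering is depth-based reprojection: each source pixel is unprojected into 3D using its depth value and then projected into the target viewport. The minimal pinhole-camera sketch below assumes shared intrinsics and a pure sideways translation between views; TMIV’s actual renderer is considerably more elaborate.

```python
import numpy as np


def reproject(u, v, depth, K, R, t):
    """Unproject pixel (u, v) with metric depth from a source view, then
    project the resulting 3D point into a target view posed by (R, t).
    Pinhole model; both views share the intrinsic matrix K for simplicity."""
    p_src = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    p_tgt = R @ p_src + t
    uvw = K @ p_tgt
    return uvw[:2] / uvw[2]


K = np.array([[1000.0, 0.0, 640.0],    # focal length and principal point
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                          # parallel cameras
t = np.array([-0.1, 0.0, 0.0])         # 10 cm baseline to the right
print(reproject(640, 360, depth=2.0, K=K, R=R, t=t))  # -> [590. 360.]
```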

New features in the 2nd edition are, in no particular order, coloured depth, capture device information, patch margins, background views, static background atlases, support for decoder-side depth estimation, chroma dynamic range modification, piecewise linear normalized disparity quantization, and linear depth quantization. These features provide additional functionality or improved performance.

The first edition of the standard included the MIV Main profile for MVD, the MIV Extended profile, which enables MPI, and the MIV Geometry Absent profile, which is suitable for cloud-based and decoder-side depth estimation. The second edition adds the MIV 2 profile, a superset of the existing profiles that covers all new functionality. Additionally, a profile-under-consideration document was started to study the inclusion of narrower profiles in this edition.

MPEG advances Point Cloud Coding with Enhanced G-PCC Standardization

At its 148th meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) made substantial progress in standardizing Geometry-based Point Cloud Compression (G-PCC), a key technology for efficiently managing large 3D data sets, such as those used in virtual reality, autonomous vehicles, and immersive multimedia applications. To balance the need for legacy compatibility with advancements, MPEG has opted to release the new version as an additional part of MPEG-I, titled Enhanced G-PCC, rather than as a second edition. This approach ensures that users of the original G-PCC standard can continue utilizing it as required.

Enhanced G-PCC introduces several advanced features to improve the compression and transmission of 3D point clouds. Notable enhancements include inter-frame coding, refined octree coding techniques, Trisoup surface coding for smoother geometry representation, and dynamic Optimal Binarization with Update On-the-fly (OBUF) modules. These updates provide higher compression efficiency while managing computational complexity and memory usage, making them particularly advantageous for real-time processing and high visual fidelity applications, such as LiDAR data for autonomous driving and dense point clouds for immersive media.
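
Octree geometry coding, the backbone of G-PCC, recursively splits the bounding cube and signals one 8-bit occupancy mask per occupied node; entropy coding of those masks is where context-modelling machinery such as OBUF comes in. The sketch below serializes the masks for power-of-two integer coordinates and omits entropy coding entirely; it is a didactic illustration, not the G-PCC algorithm.

```python
import numpy as np


def octree_occupancy(points: np.ndarray, depth: int) -> list:
    """Serialize a point cloud as breadth-first 8-bit occupancy masks.
    points: (N, 3) non-negative integers below 2**depth."""
    masks, nodes = [], [points]
    for level in range(depth - 1, -1, -1):
        next_nodes = []
        for pts in nodes:
            bits = (pts >> level) & 1                    # one bit per axis
            child = bits[:, 0] * 4 + bits[:, 1] * 2 + bits[:, 2]
            mask = 0
            for c in range(8):
                sel = pts[child == c]
                if len(sel):
                    mask |= 1 << c
                    next_nodes.append(sel)
            masks.append(mask)
        nodes = next_nodes
    return masks


pts = np.array([[0, 0, 0], [7, 7, 7], [7, 7, 6]])        # 3-bit coordinates
print([f"{m:08b}" for m in octree_occupancy(pts, depth=3)])
```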

By adding this new part to MPEG-I, MPEG addresses the industry’s growing demand for scalable, versatile 3D compression technology capable of handling both dense and sparse point clouds. Enhanced G-PCC provides a robust framework that meets the diverse needs of both current and emerging applications in 3D graphics and multimedia, solidifying its role as a vital component of modern multimedia systems.

MPEG expresses its gratitude to all experts and contributors for their efforts in this significant achievement and looks forward to further refining the Enhanced G-PCC standard in future meetings.

MPEG completes Subjective Quality Testing for Film Grain Synthesis using the Film Grain Characteristics SEI Message

At the 148th MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5; JVET) and MPEG Visual Quality Assessment (AG 5) completed a formal expert viewing experiment evaluating the subjective quality impact of film grain synthesis controlled by the Film Grain Characteristics (FGC) SEI message. The evaluation demonstrates the ability of film grain synthesis to mask compression artifacts of the underlying coding scheme. For the test, the FGC SEI messages were tailored to a set of example video sequences covering a wide range of grain characteristics, such as scans of original film material, digital camera noise, and artificially inserted synthetic film grain added to digitally captured video. The test compared the subjective performance with and without film grain synthesis using VVC and HEVC bitstreams, demonstrating the beneficial impact of this technology on the subjective impression of the reconstructed video. The results reveal superior subjective performance at all tested bitrates, providing bitrate savings of up to a factor of 10 for some test points.
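
For orientation, the FGC SEI’s frequency-filtering model describes grain through horizontal and vertical cutoff frequencies and scaling values, and the decoder adds synthesized band-limited noise after decoding. The numpy sketch below captures that idea only loosely: the parameters are arbitrary, and real implementations operate block-wise with intensity-dependent scaling driven by the SEI parameters.

```python
import numpy as np


def synthesize_grain(shape, h_cutoff, v_cutoff, scale, seed=0):
    """Band-limited pseudo-random grain, loosely following the idea of the
    FGC frequency-filtering model: white noise is low-pass filtered at
    separate horizontal/vertical cutoffs (fractions of Nyquist), then
    scaled to the requested standard deviation."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft2(rng.standard_normal(shape))
    fy = np.fft.fftfreq(shape[0])[:, None]    # vertical frequencies
    fx = np.fft.rfftfreq(shape[1])[None, :]   # horizontal frequencies
    spectrum[(np.abs(fy) > v_cutoff / 2) | (fx > h_cutoff / 2)] = 0
    grain = np.fft.irfft2(spectrum, s=shape)
    return scale * grain / grain.std()


decoded = np.full((720, 1280), 128.0)         # stand-in for a decoded frame
grainy = np.clip(decoded + synthesize_grain(decoded.shape, 0.5, 0.5, 4.0), 0, 255)
print(f"mean {grainy.mean():.1f}, std {grainy.std():.1f}")
```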