The 146th MPEG meeting took place in Rennes, France, from 22 to 26 April 2024. More information is available at https://www.mpeg.org/meetings/mpeg-146/.
MPEG issues Call for Proposals for AI-based Point Cloud Coding
At the 146th MPEG meeting, MPEG Technical Requirements (WG 2) issued a Call for Proposals (CfP) focusing on AI-based point cloud coding technologies. This initiative stems from ongoing explorations by MPEG into potential use cases, requirements, and the capabilities of AI-driven point cloud encoding, particularly for dynamic point clouds.
With recent significant progress in AI-based point cloud compression technologies, MPEG is keen on studying and adopting AI methodologies. MPEG is specifically looking for learning-based codecs capable of handling a broad spectrum of dynamic point clouds, which are crucial for applications ranging from immersive experiences to autonomous driving and navigation.
As the field evolves rapidly, MPEG expects to receive multiple innovative proposals. These may include a unified codec, capable of addressing multiple types of point clouds, or specialized codecs tailored to meet specific requirements, contingent upon demonstrating clear advantages. MPEG has therefore publicly called for submissions of AI-based point cloud codecs, aimed at deepening the understanding of the various options available and their respective impacts. Submissions that meet the requirements outlined in the call will be invited to provide source code for further analysis, potentially laying the groundwork for a new standard in AI-based point cloud coding. MPEG welcomes all relevant contributions and looks forward to evaluating the responses.
Interested parties are requested to contact the MPEG WG 2 Convenor Igor Curcio (igor.curcio@nokia.com) and MPEG WG 7 Convenor Marius Preda (marius.preda@it-sudparis.eu) to register their participation in the CfP and to submit responses for review at the 148th MPEG meeting in November 2024. Further details are given in the CfP, issued as WG 2 document N 365 and available from https://www.mpeg.org/meetings/mpeg-146/.
MPEG issues Call for Interest in Object Wave Compression
At the 146th MPEG meeting, MPEG Technical Requirements (WG 2) issued a Call for Interest (CfI) in object wave compression. Computer holography, a 3D display technology, uses a digital fringe pattern called a computer-generated hologram (CGH) to reconstruct 3D images from input 3D models. Because their wearable design places the display near the eye, holographic near-eye displays (HNEDs) greatly reduce the required pixel count, making them frontrunners for the early commercialization of computer holography, with significant research underway for product development.
Innovative approaches facilitate the transmission of object wave data, crucial for CGH calculations, over networks. Object wave transmission offers several advantages, including independent treatment from playback device optics, lower computational complexity, and compatibility with video coding technology. These advancements open doors for diverse applications, ranging from entertainment experiences to real-time two-way spatial transmissions, revolutionizing fields such as remote surgery and virtual collaboration. As MPEG explores object wave compression for computer holography transmission, a Call for Interest seeks contributions to address market needs in this field.
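To make the notion of object wave data concrete, the following is a minimal, illustrative Python sketch. It is not part of the CfI and uses a textbook spherical-wave superposition rather than any MPEG-specified method; all parameter values are assumptions chosen for illustration. It computes the complex object wave produced by a few 3D points on a hologram plane, which is the kind of amplitude/phase field an object wave codec would compress before a CGH is computed at the playback device.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the CfI):
wavelength = 532e-9            # green laser, in meters
k = 2 * np.pi / wavelength     # wavenumber
pitch = 8e-6                   # hologram-plane pixel pitch (m)
N = 512                        # hologram resolution (N x N)

# Hologram-plane sample coordinates, centered at the origin
coords = (np.arange(N) - N / 2) * pitch
X, Y = np.meshgrid(coords, coords)

# A toy "3D model": a few point emitters (x, y, z, amplitude)
points = [
    (0.0,      0.0,     0.05, 1.0),
    (0.5e-3,   0.2e-3,  0.06, 0.8),
    (-0.4e-3, -0.3e-3,  0.07, 0.6),
]

# Object wave U(x, y) = sum_j a_j / r_j * exp(i k r_j),
# i.e., a superposition of spherical waves from each point.
U = np.zeros((N, N), dtype=np.complex128)
for px, py, pz, amp in points:
    r = np.sqrt((X - px) ** 2 + (Y - py) ** 2 + pz ** 2)
    U += amp / r * np.exp(1j * k * r)

# An object wave codec would compress this complex field, e.g.,
# as amplitude and phase planes mapped onto video frames.
amplitude = np.abs(U)
phase = np.angle(U)   # wrapped to [-pi, pi]
```

Because the amplitude and phase planes are ordinary 2D arrays, they map naturally onto existing video coding technology, which is one of the advantages of object wave transmission noted above.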
Interested parties are requested to contact the MPEG WG 2 Convenor Igor Curcio (igor.curcio@nokia.com) and to submit inputs for review at the 147th MPEG meeting in July 2024. Further details are given in the Call for Interest, issued as WG 2 document N 377 and available from https://www.mpeg.org/meetings/mpeg-146/.
MPEG reaches First Milestone for Fifth Edition of Open Font Format
At the 146th MPEG meeting, MPEG Systems (WG 3) promoted the 5th edition of ISO/IEC 14496-22 Open font format to Committee Draft (CD), marking the initial stage of standard development.
The importance of textual representation within multimedia content cannot be overstated. In recognition of this, MPEG Systems has diligently pursued the standardization of interoperable font formats. With the commencement of its 5th edition, a pivotal milestone has been achieved. This latest iteration not only enhances the legibility of the specification but also transcends previous limitations, notably the 64K glyph encoding constraint in a single font file. By surpassing this barrier, the new edition facilitates the comprehensive coverage of the entire Unicode repertoire, accommodating diverse world languages and writing systems, including multiple glyph variants, within a single font file.
Moreover, the latest edition introduces more space-efficient composite glyph representations, along with a myriad of novel features and capabilities tailored for variable fonts. This innovation culminates in substantial reductions in font file sizes and empowers the creation of parametric variable fonts utilizing higher order interpolations.
The standard is expected to reach Final Draft International Standard (FDIS) status by the end of 2025.
MPEG ratifies Second Edition of Scene Description
At the 146th MPEG meeting, MPEG Systems (WG 3) promoted the 2nd edition of ISO/IEC 23090-14 Scene description to Final Draft International Standard (FDIS), the final stage of standard development.
Since the inaugural release of the standard on immersive media scene description in 2022, the momentum in extending its capabilities has remained strong. The latest edition consolidates two amendments with its predecessor into a single, more readable specification. Noteworthy advancements include the seamless integration of MPEG-developed immersive media objects, such as Video-based Point Cloud Compression (V-PCC, as specified in ISO/IEC 23090-5) and MPEG Immersive Video (MIV, as specified in ISO/IEC 23090-12), within a scene framework. Furthermore, this edition strengthens support for a range of data types essential for immersive scenes, including haptics, augmented reality, avatars, interactivity, and lighting.
Looking ahead, MPEG Systems is steadfast in its commitment to advancing the standard’s development, with plans to expand support to encompass MPEG-I immersive audio and beyond.
MPEG reaches First Milestone for Second Edition of MPEG Immersive Video (MIV)
At the 146th MPEG meeting, MPEG Video Coding (WG 4) reached the Committee Draft (CD) stage of the 2nd edition of ISO/IEC 23090-12 MPEG immersive video (MIV), the first stage of standard development.
MIV was developed to support the compression of immersive video content, in which multiple real or virtual cameras capture a real or virtual 3D scene. The standard enables the storage and distribution of immersive video content over existing and future networks for playback with 6 degrees of freedom (6DoF) of view position and orientation. MIV is a flexible standard for multi-view video plus depth (MVD) and multi-plane image (MPI) content that leverages strong hardware support for commonly used video formats to compress volumetric video.
New features in the 2nd edition are coloured depth, capture device information, patch margins, background views, static background atlases, support for decoder-side depth estimation, chroma dynamic range modification, piecewise linear normalized disparity quantization, and linear depth quantization. These features provide additional functionality and improved performance.
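For readers unfamiliar with the depth representation that the new quantization tools refine, the sketch below shows a generic MVD-style normalized-disparity mapping in Python. It is an illustration of the common approach, not the normative MIV equations: storing depth as quantized 1/z gives near geometry, which matters most for view synthesis, finer quantization than far geometry.

```python
def depth_to_normalized_disparity(z, z_near, z_far):
    """Map depth z in [z_near, z_far] to normalized disparity in [0, 1].

    Disparity is proportional to 1/z, so near samples (which matter most
    for view synthesis) receive finer quantization than far samples.
    """
    return (1.0 / z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far)

def quantize(d, bits=10):
    """Uniformly quantize normalized disparity to an integer code."""
    max_code = (1 << bits) - 1
    return round(d * max_code)

# Example: a sample at 2 m in a scene spanning 0.5 m .. 50 m
d = depth_to_normalized_disparity(2.0, z_near=0.5, z_far=50.0)
code = quantize(d, bits=10)   # -> an integer in [0, 1023]
```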
The first edition of the standard included the MIV Main profile for MVD, the MIV Extended profile, which enables MPI, and the MIV Geometry Absent profile, which is suitable for use with cloud-based and decoder-side depth estimation. In the 2nd edition, the MIV 2 profile is being added, which is a superset of the existing profiles and covers all new functionality. In addition, a document entitled “profiles under consideration” was started to study the inclusion of narrower profiles in this edition.
Finally, a 2nd edition of ISO/IEC 23090-23 Conformance and reference software for MPEG immersive video is expected to be requested at the next MPEG meeting.
MPEG releases New Editions of AVC, HEVC, and Video CICP
At the 146th MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5), also known as JVET, promoted (i) the 11th edition of ISO/IEC 14496-10 Advanced Video Coding (AVC), (ii) the 5th edition of ISO/IEC 23008-2 High Efficiency Video Coding (HEVC), and (iii) the 3rd edition of ISO/IEC 23091-2 Video Coding-independent Code Points (Video CICP) to Final Draft International Standard (FDIS), the final stage of standard development.
The latest editions of AVC and HEVC now incorporate support for additional SEI messages, drawing from ISO/IEC 23002-7 Versatile Supplemental Enhancement Information (SEI) Messages for Coded Video Bitstreams. Specifically, this includes the integration of (a) the neural network post-filtering SEI message and (b) the phase indication SEI message into these standards. HEVC has been expanded to include extended multiview profiles for 8-bit and 10-bit video, as well as monochrome multiview profiles supporting standalone depth map coding with up to 16 bits. Additionally, the new edition of Video CICP introduces additional color code points and implements text improvements and clarifications.
These advancements demonstrate a commitment to maintaining support for legacy standards developed jointly with ITU-T, ensuring their relevance to current market needs.
MPEG Promotes Standard Development for Machine-Optimized Video Compression
At the 146th MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG16 (WG 5), also known as JVET, advanced ISO/IEC 23888-3 “Optimization of Encoders and Receiving Systems for Machine Analysis of Coded Video Content” as part 3 of MPEG AI to Committee Draft Technical Report (CDTR), marking the initial stage of standard development.
In recent years, the efficacy of machine learning-based algorithms in video content analysis has steadily improved. However, an encoder designed for human consumption does not always produce compressed video conducive to effective machine analysis. This challenge lies not in the compression standard but in optimizing the encoder or receiving system. The forthcoming technical report addresses this gap by showcasing technologies and methods that optimize encoders or receiving systems to enhance machine analysis performance.
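As a purely hypothetical illustration of such encoder-side optimization (this sketch is not drawn from the technical report), one could sweep a conventional encoder's quality setting and keep the cheapest operating point that preserves machine-task accuracy. The `detection_accuracy` function below is a placeholder the user must supply, e.g., an object detector scored on the decoded frames; the encoding uses standard ffmpeg options.

```python
import os
import subprocess

def encode(src, dst, crf):
    """Encode with a standard encoder (H.264 via ffmpeg here) at a given
    quality level; any encoder or rate-control knob could be swept this way."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst],
        check=True,
    )

def detection_accuracy(video_path):
    """Placeholder for the machine-analysis task, e.g., running an object
    detector on the decoded frames and scoring against reference annotations.
    Hypothetical: must be supplied by the user."""
    raise NotImplementedError

def pick_operating_point(src, crfs=(23, 28, 33, 38), tolerance=0.02):
    """Return the smallest encoding whose machine-task score stays within
    `tolerance` of the best score observed across the sweep."""
    results = []
    for crf in crfs:
        out = f"encoded_crf{crf}.mp4"
        encode(src, out, crf)
        results.append((crf, detection_accuracy(out), os.path.getsize(out)))
    best_score = max(score for _, score, _ in results)
    feasible = [r for r in results if r[1] >= best_score - tolerance]
    return min(feasible, key=lambda r: r[2])  # smallest file among feasible
```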
Developed collaboratively with ITU-T SG16, this technical report will be published as a technically aligned twin text, corresponding to a forthcoming supplement or technical paper of ITU-T. It is available at https://www.mpeg.org/meetings/mpeg-146/.
MPEG reaches First Milestone for MPEG-I Immersive Audio
At the 146th MPEG meeting, MPEG Audio Coding (WG 6) promoted ISO/IEC 23090-4 MPEG-I immersive audio and ISO/IEC 23090-34 Immersive audio reference software to Committee Draft (CD) stage, the first stage of standard development. The MPEG-I immersive audio standard sets a new benchmark for compact and lifelike audio representation in virtual and physical spaces, catering to Virtual, Augmented, and Mixed Reality (VR/AR/MR) applications. By enabling high-quality, real-time interactive rendering of audio content with six degrees of freedom (6DoF), users can experience immersion, freely exploring 3D environments while enjoying dynamic audio.
Designed in accordance with MPEG’s rigorous standards, MPEG-I immersive audio ensures efficient distribution across bandwidth-constrained networks without compromising on quality. Unlike proprietary frameworks, this standard prioritizes interoperability, stability, and versatility, supporting both streaming and downloadable content while seamlessly integrating with MPEG-H 3D audio compression.
MPEG-I’s comprehensive modeling of real-world acoustic effects, including sound source properties and environmental characteristics, guarantees an authentic auditory experience. Moreover, its efficient rendering algorithms balance computational complexity with accuracy, empowering users to finely tune scene characteristics for desired outcomes.
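For intuition about two of the acoustic effects mentioned above, the toy Python sketch below computes 1/r distance attenuation and the Doppler shift for a single moving point source. It is a generic physics illustration, not the MPEG-I immersive audio rendering pipeline, and all names and values are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius

def render_point_source(listener_pos, source_pos, source_vel, f_source, gain=1.0):
    """Toy 6DoF audio cues for one point source (illustrative only).

    Returns (level, observed_frequency) for the current frame.
    """
    offset = source_pos - listener_pos
    distance = float(np.linalg.norm(offset))
    direction = offset / max(distance, 1e-9)  # unit vector listener -> source

    # 1/r distance attenuation of a point source, clamped near the listener
    level = gain / max(distance, 0.1)

    # Doppler shift: radial_speed > 0 means the source is receding,
    # lowering the observed pitch; approaching sources raise it.
    radial_speed = float(np.dot(source_vel, direction))
    f_observed = f_source * SPEED_OF_SOUND / (SPEED_OF_SOUND + radial_speed)
    return level, f_observed

# Example: a 440 Hz source 5 m away, approaching at 10 m/s -> ~453 Hz
level, f = render_point_source(
    listener_pos=np.array([0.0, 0.0, 0.0]),
    source_pos=np.array([5.0, 0.0, 0.0]),
    source_vel=np.array([-10.0, 0.0, 0.0]),
    f_source=440.0,
)
```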
The release of the CD for ISO/IEC 23090-34 Immersive Audio Reference Software, which encompasses all aspects of the standard, facilitates real-time evaluation and adoption in industry and consumer applications. Interested parties can access both the text specification and reference software publicly at https://www.mpeg.org/meetings/mpeg-146/, with additional insights available through a dedicated white paper released during this meeting.
MPEG reaches First Milestone for Video-based Dynamic Mesh Coding (V-DMC)
At the 146th MPEG meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) reached the Committee Draft (CD) stage of ISO/IEC 23090-29 Video-based Dynamic Mesh Compression (V-DMC), the first stage of standard development. This standard represents a significant advancement in 3D content compression, catering to the ever-increasing complexity of dynamic meshes used across various applications, including real-time communications, storage, free-viewpoint video, augmented reality (AR), and virtual reality (VR). The standard addresses the challenges associated with dynamic meshes that exhibit time-varying connectivity and attribute maps, which were not sufficiently supported by previous standards.
Video-based Dynamic Mesh Compression promises to revolutionize how dynamic 3D content is stored and transmitted, allowing more efficient and realistic interactions with 3D content globally. The Committee Draft follows an extensive call for proposals issued by MPEG, which invited technology developers to submit innovations that could contribute to the new standard. Proposals were evaluated based on various objective and subjective metrics to ensure the selected technologies meet and exceed the current market and technical demands. MPEG extends its gratitude to all contributors who have submitted proposals and participated in the rigorous testing and evaluation process. The results of these evaluations have shaped the draft of the standard, ensuring it meets the high expectations and needs of the industry.
The Committee Draft of the Video-based Dynamic Mesh Compression standard is now available for further comments and evaluation by national bodies. It is available at https://www.mpeg.org/meetings/mpeg-146/. MPEG encourages continued participation from the community to finalize the standard for publication.
MPEG reaches First Milestone for Low Latency, Low Complexity LiDAR Coding
At the 146th MPEG meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) reached the Committee Draft (CD) stage of ISO/IEC 23090-30 Low Latency, Low Complexity LiDAR Coding, the first stage of standard development. This milestone underscores MPEG’s commitment to advancing coding technologies required by modern LiDAR applications across diverse sectors. The new standard addresses critical needs in the processing and compression of LiDAR-acquired point clouds, which are integral to applications ranging from automated driving to smart city management. It provides an optimized solution for scenarios requiring high efficiency in both compression and real-time delivery, responding to the increasingly complex demands of LiDAR data handling.
LiDAR technology has become essential for various applications that require detailed environmental scanning, from autonomous vehicles navigating roads to robots mapping indoor spaces. The Low Latency, Low Complexity LiDAR Coding standard will facilitate a new level of efficiency and responsiveness in LiDAR data processing, which is critical for the real-time decision-making capabilities needed in these applications.
This Committee Draft builds on comprehensive analysis and industry feedback to address specific challenges such as noise reduction, temporal data redundancy, and the need for region-based quality of compression. The standard also emphasizes the importance of low latency coding to support real-time applications, essential for operational safety and efficiency in dynamic environments.
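To illustrate why low-latency design matters (a toy example, not the ISO/IEC 23090-30 algorithm), the sketch below quantizes and delta-codes the ranges of a single LiDAR scan line. Each line can be encoded and emitted as soon as the sensor captures it, exploiting the strong spatial correlation between neighbouring returns without buffering future frames.

```python
def encode_scanline(ranges, resolution=0.005):
    """Quantize ranges (meters) to `resolution` and delta-code consecutive
    returns; small residuals entropy-code compactly in a real codec."""
    q = [round(r / resolution) for r in ranges]
    return [q[0]] + [q[i] - q[i - 1] for i in range(1, len(q))]

def decode_scanline(residuals, resolution=0.005):
    """Invert encode_scanline by accumulating the residuals."""
    q, acc = [], 0
    for d in residuals:
        acc += d
        q.append(acc)
    return [v * resolution for v in q]

# Example: neighbouring returns on a wall vary slowly -> tiny residuals;
# the jump mid-line corresponds to a new object entering the scan.
line = [4.000, 4.003, 4.007, 4.012, 9.950, 9.948]   # meters
encoded = encode_scanline(line)     # -> [800, 1, 0, 1, 1188, 0]
decoded = decode_scanline(encoded)  # matches the input to within 5 mm
```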
Key applications highlighted for the new standard include:
- Automotive Industry: enhancing driver assistance systems and self-driving functionalities through efficient and rapid processing of road and environmental data.
- Robotics: optimizing navigation and operational efficiency in automated robots.
- Surveillance: supporting advanced security systems with combined video and LiDAR data processing capabilities.
- Aerial Drones: enabling safer and more effective use of drones in professional and emergency scenarios through improved obstacle detection and environmental mapping.
- Industrial Automation: enhancing precision and safety in industrial applications through better tracking and positioning of machinery.
The Committee Draft is available at https://www.mpeg.org/meetings/mpeg-146/.
MPEG White Paper
At the 146th MPEG meeting, MPEG Liaison and Communication (AG 3) approved the following MPEG white paper, available at https://www.mpeg.org/whitepapers/.
White paper on MPEG-I Immersive Audio
The MPEG-I immersive audio standard aims at providing a convincing solution for compact representation and for high-quality real-time interactive rendering of virtual audio content with six degrees of freedom (6DoF), i.e., the user can not only turn their head in all directions (pitch/yaw/roll) but also move around freely in 3D space.
To provide a realistic user experience as the user explores the 6DoF virtual world, many acoustic effects of the real world must be modeled accurately, including properties of sound sources (e.g., level, size, radiation/directivity characteristics, Doppler processing) as well as effects of the acoustic environment (e.g., sound reflections and reverberation, diffraction, total and partial occlusion). MPEG-I immersive audio features a plethora of technology components that support computationally efficient rendering of such aspects. Unlike many existing technologies, it offers scene descriptions using physics-inspired metadata (for easier scene authoring from CAD scenes and material databases) and possibilities for artistic tuning of the scene characteristics to achieve the desired results.
During the standardization process, extensive listening test comparisons and evaluations were conducted.