The 143rd MPEG meeting took place in Geneva from 17 to 21 July 2023.
MPEG finalizes the Carriage of Uncompressed Video and Images in ISOBMFF
At the 143rd MPEG meeting, MPEG Systems (WG 3) finalized ISO/IEC 23001-17 – Carriage of uncompressed video and images in ISO Base Media File Format (ISOBMFF) – by promoting it to the Final Draft International Standard (FDIS) stage. ISOBMFF supports the carriage of a wide range of media data, such as video, audio, point clouds, and haptics, and this support has now been further expanded to uncompressed video and images.
ISO/IEC 23001-17 specifies how uncompressed 2D image and video data is carried in files that comply with the ISOBMFF family of standards. This encompasses a range of data types, including monochromatic and colour data, transparency (alpha) information, and depth information. The standard enables the industry to exchange uncompressed video and image data effectively while leveraging the metadata ISOBMFF provides, such as timing, colour space, and sample aspect ratio, for interoperable interpretation and/or display of video and image data.
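To make that concrete, here is a minimal sketch, assuming a packed, byte-aligned RGBA layout, of the sizing and timing arithmetic a file writer needs when placing raw frames into an ISOBMFF track. The helper names are invented for illustration; the normative frame layout and component signalling (e.g., the 'uncC' and 'cmpd' boxes) are defined by ISO/IEC 23001-17.

```python
# Sketch: sizing and timing of uncompressed RGBA video samples in an
# ISOBMFF track. All helper names are hypothetical; the normative box
# syntax lives in ISO/IEC 23001-17.

from dataclasses import dataclass

@dataclass
class UncompressedConfig:
    width: int
    height: int
    components: int        # e.g. 4 for R, G, B, A
    bit_depth: int         # bits per component

    def bytes_per_sample(self) -> int:
        # Packed, byte-aligned layout assumed; real files may use row
        # alignment, tiling, or interleave options signalled in the file.
        return self.width * self.height * self.components * self.bit_depth // 8

def sample_durations(frame_rate: int, timescale: int, n_frames: int):
    """Per-sample durations in track timescale units (e.g. for 'stts')."""
    return [timescale // frame_rate] * n_frames

cfg = UncompressedConfig(width=1920, height=1080, components=4, bit_depth=8)
print(cfg.bytes_per_sample())            # 8294400 bytes per 1080p RGBA frame
print(sample_durations(25, 25000, 3))    # [1000, 1000, 1000]
```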
MPEG reaches the First Milestone for two ISOBMFF Enhancements
At the 143rd MPEG meeting, MPEG Systems (WG 3) enhanced the capabilities of the ISO Base Media File Format (ISOBMFF) family of standards by promoting two standards to their first milestone, Committee Draft Amendment (CDAM):
- ISO/IEC 14496-12 (8th edition) CDAM 1 – Support for T.35, original sample duration, and other improvements – will enable the carriage of user data registered as specified in ITU-T Rec. T.35 as part of the media sample data (a parsing sketch follows this list). It also supports a more efficient way of describing subsamples by referencing features already defined by other subsamples.
- ISO/IEC 14496-15 (6th edition) CDAM 3 – Support for neural-network post-filter supplemental enhancement information and other improvements – will enable the carriage of the newly defined Supplemental Enhancement Information (SEI) messages for neural-network post-filters in ISOBMFF. The carriage of the neural-network post-filter characteristics (NNPFC) SEI message and the neural-network post-filter activation (NNPFA) SEI message enable the delivery of a base post-processing filter and a series of neural network updates synchronized with the input video pictures.
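To illustrate the first item, the following sketch parses the header of an ITU-T T.35 registered user-data payload: a one-byte country code, with 0xFF indicating that an extension byte follows, and provider-defined data after that. The function name and example bytes are illustrative; only the country-code handling comes from T.35 itself.

```python
def parse_t35(payload: bytes):
    """Parse the header of an ITU-T T.35 registered user-data payload.

    T.35 defines a one-byte country code; the value 0xFF means an
    extension byte follows. Everything after that is provider-defined
    (providers commonly start with a 16-bit provider code, as in the
    AVC/HEVC/VVC 'user data registered by ITU-T T.35' SEI message).
    """
    if not payload:
        raise ValueError("empty T.35 payload")
    pos = 0
    country_code = payload[pos]
    pos += 1
    if country_code == 0xFF:          # extended country code
        country_code = (country_code << 8) | payload[pos]
        pos += 1
    return country_code, payload[pos:]  # provider-defined remainder

code, rest = parse_t35(bytes([0xB5, 0x00, 0x3C]) + b"provider data")
print(hex(code))   # 0xb5 (United States); provider-specific bytes in `rest`
```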
Both standards are planned to be completed, i.e., to reach the status of Final Draft Amendment (FDAM), by the end of 2024.
MPEG ratifies Third Editions of VVC and VSEI
At the 143rd MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5) issued the Final Draft International Standard (FDIS) texts of the third editions of the Versatile Video Coding (VVC, ISO/IEC 23090-3) and the Versatile Supplemental Enhancement Information (VSEI, ISO/IEC 23002-7) standards. The corresponding twin texts were also submitted to ITU-T SG 16 for consent as ITU-T H.266 and ITU-T H.274, respectively. New elements in the third edition of VVC are the support of an unlimited level for the video profiles, as well as some technical corrections and editorial improvements on top of the second edition text. Furthermore, VVC-specific support is specified for some supplemental enhancement information (SEI) messages that may be included in VVC bitstreams but are defined in external standards. These include two systems-related SEI messages: (a) one for signalling of green metadata as specified in ISO/IEC 23001-11 and (b) one for signalling of an alternative video decoding interface for immersive media as specified in ISO/IEC 23090-13. Four further SEI messages are contained in the third edition of VSEI, namely (i) the shutter interval information SEI message, (ii) the neural-network post-filter characteristics SEI message, (iii) the neural-network post-filter activation SEI message, and (iv) the phase indication SEI message.
While the shutter interval indication is already known from Advanced Video Coding (AVC) and High Efficiency Video Coding (HEVC), the new phase indication SEI message, which signals the subsampling phase, is relevant for variable-resolution video streaming. The two SEI messages for describing and activating post-filters based on neural network technology could, for example, be used for reducing coding noise, for spatial and temporal upsampling, for colour improvement, or for general denoising of the decoder output. The description of the neural network architecture itself is based on MPEG’s neural network representation standard (ISO/IEC 15938‑17). As results from an exploration experiment have shown, neural-network-based post-filters can deliver better results than conventional filtering methods. Processes for invoking these new post-filters have already been tested in a software framework and will be made available in an upcoming version of the VVC reference software (ISO/IEC 23090-16).
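The division of labour between the two messages can be pictured as follows: characteristics messages deliver or update a filter, and activation messages switch it on for particular pictures. The sketch below imitates that control flow with a trivial smoothing kernel standing in for the neural network; it illustrates the mechanism only and does not reproduce the normative SEI semantics.

```python
import numpy as np

class PostFilterState:
    """Toy model of the NNPFC/NNPFA control flow: characteristics
    messages install or update a filter, activation messages apply it."""
    def __init__(self):
        self.filters = {}        # filter id -> callable, installed via NNPFC
        self.active_id = None    # selected via NNPFA

    def on_nnpfc(self, filter_id, weights):
        # Stand-in "network": a normalized 3x3 smoothing kernel.
        k = np.asarray(weights, dtype=np.float64)
        k = k / k.sum()
        def apply(frame):
            out = np.zeros_like(frame, dtype=np.float64)
            pad = np.pad(frame, 1, mode="edge")
            for dy in range(3):
                for dx in range(3):
                    out += k[dy, dx] * pad[dy:dy + frame.shape[0],
                                           dx:dx + frame.shape[1]]
            return out
        self.filters[filter_id] = apply

    def on_nnpfa(self, filter_id):
        self.active_id = filter_id

    def postprocess(self, decoded_frame):
        if self.active_id is None:
            return decoded_frame       # no filter activated for this picture
        return self.filters[self.active_id](decoded_frame)

state = PostFilterState()
state.on_nnpfc(0, [[1, 2, 1], [2, 4, 2], [1, 2, 1]])  # deliver base filter
state.on_nnpfa(0)                                     # activate it
frame = np.random.default_rng(0).integers(0, 256, (4, 4)).astype(np.float64)
print(state.postprocess(frame).round(1))              # filtered decoder output
```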
MPEG reaches the First Milestone for AVC (11th Edition) and an HEVC Amendment
At the 143rd MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5) issued the Committee Draft (CD) text of the eleventh edition of the Advanced Video Coding standard (AVC, ISO/IEC 14496-10) and the Committee Draft Amendment (CDAM) text for an extension of the High Efficiency Video Coding standard (HEVC, ISO/IEC 23008-2). Both add specific support for three new supplemental enhancement information (SEI) messages from the third edition of Versatile Supplemental Enhancement Information (VSEI), namely (i) the phase indication SEI message, (ii) the neural-network post-filter characteristics SEI message, and (iii) the neural-network post-filter activation SEI message, so that these can be included in AVC and HEVC bitstreams. Furthermore, code point identifiers are added for YCgCo-R colour representation with equal luma and chroma bit depths and for a colour representation referred to as IPT-PQ-C2 (from the upcoming SMPTE ST 2128 specification). The new edition of AVC also contains some technical corrections and editorial improvements on top of the 10th edition text, and the HEVC amendment specifies additional profiles supporting multiview applications, namely a 10-bit multiview profile as well as 8-bit, 10-bit, and 12-bit monochrome multiview profiles, which could be beneficial for coding depth maps as auxiliary pictures.
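For readers unfamiliar with YCgCo-R: it is a lifting-based, integer-reversible variant of the YCgCo transform. The sketch below shows the classic lifting steps and verifies the round trip; Cg and Co nominally need one extra bit of dynamic range, and the new code points signal use of the representation when luma and chroma are nevertheless coded at equal bit depths.

```python
def rgb_to_ycgco_r(r: int, g: int, b: int):
    """Forward YCgCo-R (lifting form): exactly invertible in integers.
    Co and Cg span one extra bit relative to the RGB input range."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

def ycgco_r_to_rgb(y: int, cg: int, co: int):
    """Inverse lifting steps, applied in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b

# Round-trip check over 8-bit corner cases:
for rgb in [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]:
    assert ycgco_r_to_rgb(*rgb_to_ycgco_r(*rgb)) == rgb
```

Because each lifting step adds back exactly the floor-shifted term it subtracted, the transform is exactly invertible, which is what makes it attractive for lossless and high-bit-depth coding.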
MPEG Genomic Coding extended to support Joint Structured Storage and Transport of Sequencing Data, Annotation Data, and Metadata
At the 143rd MPEG meeting, MPEG Genomic Coding (WG 6) extended the file format and Application Programming Interfaces (APIs) to support annotations derived from the analysis of DNA sequencing data. DNA sequencing technologies produce extremely large amounts of heterogeneous data, including raw sequence reads, analysis results, annotations, and associated metadata, which are stored in different repositories worldwide; a standardized and interoperable format is needed to make this data available and accessible for new advanced features and applications. Structuring and compressing such genomic data reduces storage size, increases transmission speed, and improves interoperable browsing and searching performance over these large data sets, as required by a wide variety of applications and use cases.
ISO/IEC 23092-1 (3rd edition) – Transport and storage of genomic information – and ISO/IEC 23092-3 (3rd edition) – Metadata and application programming interfaces (APIs) – supporting joint coding of sequencing and annotation data, have been promoted to Draft International Standard (DIS) and Committee Draft (CD), respectively. The MPEG-G standard series (ISO/IEC 23092) can now support full application pipelines, covering data representation and compression from the sequencing output up to the results of tertiary analysis, in a single structured file format with standard APIs and metadata as well as standard browsing and searching capabilities.
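To give a feel for what joint storage with standard APIs could enable, here is a purely hypothetical usage sketch; every name in it is invented for illustration, and the actual API definitions are those of ISO/IEC 23092-3.

```python
# Hypothetical sketch only: illustrates the kind of region-based, joint
# access to sequencing reads and annotations that the MPEG-G APIs
# standardize. `MpeggFile` and all method names are invented here.

class MpeggFile:
    def __init__(self, path: str):
        self.path = path  # one structured file: reads, annotations, metadata

    def reads(self, chromosome: str, start: int, end: int) -> list:
        # A real implementation would decode only the access units
        # overlapping the requested genomic region.
        return []

    def annotations(self, chromosome: str, start: int, end: int) -> list:
        # e.g. variant calls from tertiary analysis, stored with the reads
        return []

f = MpeggFile("sample.mgb")
region = ("chr1", 1_000_000, 1_010_000)
print(len(f.reads(*region)), len(f.annotations(*region)))
```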
MPEG completes Reference Software and Conformance for Geometry-based Point Cloud Compression
At the 143rd MPEG meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) promoted two Geometry-based Point Cloud Compression (G‑PCC) related standards to the Final Draft International Standard (FDIS) stage: (i) the Reference Software (ISO/IEC 23090-21) and (ii) the Conformance (ISO/IEC 23090-21). These standards facilitate the deployment of G-PCC by providing the source code of the encoder and decoder as well as bitstreams to validate the conformance of decoder implementations.
G‑PCC addresses lossless and lossy coding of time-varying 3D point clouds with associated attributes such as colour and material properties. Its generalized approach, in which the 3D geometry is coded directly to exploit any redundancy in the point cloud itself, makes it particularly suitable for sparse point clouds representing large environments.
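G-PCC's geometry coding is commonly based on an octree: space is recursively halved along each axis, and every internal node emits one byte indicating which of its eight child cells contain points. A minimal sketch of that occupancy-code derivation (entropy coding and attribute coding omitted) is shown below; the function name and toy input are illustrative only.

```python
def octree_occupancy(points, size):
    """Depth-first occupancy codes for a point cloud in the cube [0, size)^3.
    `size` must be a power of two; one byte per internal node, as in the
    octree geometry coding that G-PCC entropy-codes (coder omitted here)."""
    codes = []
    def recurse(pts, origin, s):
        if s == 1 or not pts:
            return
        half = s // 2
        children = [[] for _ in range(8)]
        for (x, y, z) in pts:
            idx = (((x - origin[0]) >= half) << 2
                   | ((y - origin[1]) >= half) << 1
                   | ((z - origin[2]) >= half))
            children[idx].append((x, y, z))
        codes.append(sum(1 << i for i, c in enumerate(children) if c))
        for i, c in enumerate(children):
            if c:
                child_origin = (origin[0] + half * ((i >> 2) & 1),
                                origin[1] + half * ((i >> 1) & 1),
                                origin[2] + half * (i & 1))
                recurse(c, child_origin, half)
    recurse(list(points), (0, 0, 0), size)
    return codes

print(octree_occupancy([(0, 0, 0), (3, 3, 3)], 4))   # [129, 1, 128]
```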
Point clouds are typically represented by extremely large amounts of data, which is a significant barrier to mass market applications. The relative ease of capturing and rendering spatial information compared to other volumetric video representations makes point clouds increasingly popular for displaying immersive volumetric data. The current reference software implementation of a lossless, intra-frame G‑PCC encoder provides a compression ratio of up to 10:1 and lossy coding of acceptable quality for various applications with a ratio of up to 35:1.
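Taking the quoted ratios at face value, a quick back-of-the-envelope calculation shows why compression matters; the assumed raw per-point cost (three 32-bit coordinates plus 8-bit RGB) is an illustrative assumption, not a figure from the standard.

```python
# Back-of-the-envelope sizing, using the compression ratios quoted above.
# Assumed raw layout (illustrative): three 32-bit coordinates + 3 bytes RGB.
points = 1_000_000
raw_bytes = points * (3 * 4 + 3)            # 15 MB per frame
for name, ratio in [("lossless (up to)", 10), ("lossy (up to)", 35)]:
    print(f"{name:18s} {raw_bytes / ratio / 1e6:6.2f} MB per frame")
# At 30 frames per second, even 10:1 lossless still means ~45 MB/s.
```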
By providing high immersion at currently available bit rates, the G-PCC-related standards will enable various applications such as 3D mapping, indoor navigation, autonomous driving, advanced augmented reality (AR) with environmental mapping, and cultural heritage.