2018-05-18, 17:27
HDR10 Metadata Explained
Note: Support for basic metadata is part of the HDR10 specification so that displays with less peak brightness and color volume than the mastering display can adjust their tone mapping curve and gamut mapping to optimize the presentation of each HDR video.
Below are direct quotes from this source
HDR 10 Media Profile - The Consumer Technology Association (CTA)’s official HDR video standard for use in HDR televisions. HDR 10 requires the use of the SMPTE ST.2084 EOTF, BT.2020 color space, 10 bits per channel, 4:2:0 chroma subsampling, and the inclusion of SMPTE ST.2086 and associated MaxCLL and MaxFALL metadata values.
HDR 10 Media Profile defines the signal a television must be able to decode before the term “HDR compatibility” can be used in its marketing.
Note that “HDR compatibility” does not necessarily mean the ability to display the higher dynamic range, only the ability to decode footage in the HDR 10 specification and renormalize it to whatever the dynamic range and color space of the display happen to be.
PQ - Perceptual Quantization - Name of the EOTF curve developed by Dolby and standardized in SMPTE ST.2084, designed to allocate bits as efficiently as possible with respect to how human vision perceives changes in light levels.
Dolby’s tests established the Barten Threshold (also called the Barten Limit or the Barten Ramp), the point at which the difference in light levels between two values becomes visible.
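As a rough illustration of the PQ curve, here is a minimal Python sketch of the ST.2084 EOTF and its inverse. The constants and the 10,000 nit ceiling come from the published ST.2084 formula rather than from the text above, and the function names are made up for this example.

```python
# Constants of the SMPTE ST 2084 (PQ) curve, written as their exact fractions.
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875
PEAK_NITS = 10000.0      # luminance represented by a full-scale PQ signal


def pq_eotf(signal: float) -> float:
    """Map a normalized PQ signal in [0, 1] to luminance in nits."""
    e = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return PEAK_NITS * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)


def pq_inverse_eotf(nits: float) -> float:
    """Map luminance in nits to a normalized PQ signal in [0, 1]."""
    y = (min(max(nits, 0.0), PEAK_NITS) / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


if __name__ == "__main__":
    for nits in (0.1, 1, 100, 1000, 10000):
        print(f"{nits:>7} nits -> PQ signal {pq_inverse_eotf(nits):.4f}")
```

Printing the table shows how much of the signal range is spent on low light levels: 100 nits already sits around the middle of the 0 to 1 range.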
nit - A unit of luminance (brightness per unit area). It’s the colloquial term for the SI unit of candelas per square meter (1 nit = 1 cd/m²). It converts directly to the United States customary unit of foot-lamberts (1 fL = 1/π cd/ft²), with 1 fL = 3.426 nits = 3.426 cd/m².
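The 3.426 factor can be checked from the definition of the foot-lambert (1/π candela per square foot) and the length of a foot (0.3048 m). The helper names below are hypothetical, chosen only for this sketch.

```python
import math

# 1 foot-lambert = (1 / pi) cd per square foot, and 1 ft = 0.3048 m,
# so dividing by the area of a square foot in m^2 gives cd/m^2 (nits).
NITS_PER_FL = (1.0 / math.pi) / (0.3048 ** 2)


def fl_to_nits(foot_lamberts: float) -> float:
    """Convert foot-lamberts to nits (cd/m^2)."""
    return foot_lamberts * NITS_PER_FL


def nits_to_fl(nits: float) -> float:
    """Convert nits (cd/m^2) to foot-lamberts."""
    return nits / NITS_PER_FL


if __name__ == "__main__":
    print(f"1 fL  = {fl_to_nits(1):.3f} nits")
    print(f"14 fL = {fl_to_nits(14):.1f} nits")
```

For a 14 fL screen this comes out at roughly 48 nits.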
Note that the peak nits / foot-lamberts value of a projector is often lower than that of a display, even in HDR video: because a projected image covers more area and is viewed in a darker environment than a consumer’s home, the same psychological and physiological responses occur at lower light levels.
For instance, a typical digital cinema screen will have a maximum brightness of 14 fL or 48 cd/m², versus roughly 80-120 nits for reference displays and 300 nits for LCDs and plasmas in the home. Actual HDR light output ranges in theaters are adjusted accordingly, since 1000 cd/m² on a theater’s 30 foot screen is perceived to be far brighter than on a 65” flat screen.
MaxCLL Metadata - Maximum Content Light Level - An integer metadata value defining the maximum light level, in nits, of any single pixel within an encoded HDR video stream or file. MaxCLL should be measured during or after mastering. However, if you keep your color grade within your display’s HDR range, and add a hard clip for light levels beyond your display’s maximum value, you can use your display’s peak luminance as your metadata MaxCLL value.
MaxFALL Metadata - Maximum Frame Average Light Level - An integer metadata value defining the maximum average light level, in nits, for any single frame within an encoded HDR video stream or file. MaxFALL is calculated by averaging the decoded brightness values of all pixels within each frame (that is, converting the digital value of each pixel into its corresponding nits value, and averaging all of the nits values within each frame), then taking the highest of those frame averages.
MaxFALL is an important value to consider in mastering and color grading, and is usually lower than the MaxCLL value. The two values combined define how bright any individual pixel within a frame can be, and how bright the frame as a whole can be.
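The two definitions can be sketched in a few lines of Python. This is a simplified illustration, not a mastering tool: it assumes each pixel has already been reduced to a single light level in nits, it rounds the results to the nearest integer (an assumption, since the text only says the values are integers), and the function name is invented for the example.

```python
from typing import Iterable, Sequence, Tuple

# A frame is modeled as rows of per-pixel light levels, already in nits.
Frame = Sequence[Sequence[float]]


def max_cll_and_fall(frames: Iterable[Frame]) -> Tuple[int, int]:
    """Return (MaxCLL, MaxFALL) in nits for a sequence of frames."""
    max_cll = 0.0   # brightest single pixel seen in any frame
    max_fall = 0.0  # highest per-frame average seen so far
    for frame in frames:
        pixels = [level for row in frame for level in row]
        if not pixels:
            continue  # skip empty frames rather than divide by zero
        max_cll = max(max_cll, max(pixels))
        max_fall = max(max_fall, sum(pixels) / len(pixels))
    return round(max_cll), round(max_fall)


if __name__ == "__main__":
    highlight = [[100, 100], [100, 1000]]  # one bright pixel, average 325
    flat = [[400, 400], [400, 400]]        # uniform frame, average 400
    print(max_cll_and_fall([highlight, flat]))
```

In this toy input MaxCLL comes from the first frame and MaxFALL from the second, which shows that the two values need not be set by the same frame.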
Displays are limited differently on both of those values, though typically only the peak (single pixel) brightness of a display is reported. As pixels get brighter and approach their peak output, they draw more power and heat up. With current technology levels, no display can push all of its pixels into the maximum HDR brightness level at the same time - the power draw would be extremely high, and the heat generated would severely damage the display.
As a result, displays will abruptly notch down the overall image brightness when the frame average brightness exceeds the rated MaxFALL, to keep the image under the safe average brightness level, regardless of what the peak brightness of the display or encoded image stream may be.
For example, while the BVM-X300 has a peak value of 1000 nits for any given pixel (MaxCLL = 1000), the average frame brightness cannot exceed about 180 nits (MaxFALL = 180). The MaxCLL and MaxFALL metadata included in the HDR 10 media profile allow consumer displays to adjust the entire stream’s brightness to match their own display limits.
SMPTE ST.2086 Metadata - Metadata information about the display used to grade the HDR content. SMPTE ST.2086 includes information on six values: the three RGB primaries used, the white point used, and the display maximum and minimum light levels.
The RGB primaries and the white point are recorded as ½ of their (X,Y) chromaticity coordinates from the CIE 1931 standard, scaled by 100,000 and written as integers, without a decimal place. Or, in other words:
f(XPrimary) = 100,000 × XPrimary ÷ 2
f(YPrimary) = 100,000 × YPrimary ÷ 2.
For example, the (X,Y) chromaticity of DCI-P3’s red primary is (0.68, 0.32) in CIE 1931; in SMPTE ST.2086 terms it’s recorded as
R(34000,16000)
because
for R(0.68,0.32):
f(XR) = 100,000 × 0.68 ÷ 2 = 34,000
f(YR) = 100,000 × 0.32 ÷ 2 = 16,000
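The same arithmetic as a small Python sketch. The function name is hypothetical, and `round` is used instead of truncation so that floating-point noise (for example a product landing a hair below the intended integer) cannot shift the result by one.

```python
from typing import Tuple


def encode_chromaticity(x: float, y: float) -> Tuple[int, int]:
    """Encode a CIE 1931 (x, y) pair as 100,000 * value / 2."""
    return round(100_000 * x / 2), round(100_000 * y / 2)


if __name__ == "__main__":
    print(encode_chromaticity(0.68, 0.32))      # DCI-P3 red primary
    print(encode_chromaticity(0.3127, 0.3290))  # D65 white point
```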
Maximum and minimum luminance values are recorded as nits × 10,000, so that they too end up as positive integers. For instance, a display like the Sony BVM-X300 with a range from 0.0001 to 1000 nits would record its luminance as
L(10000000,1)
The full ST.2086 Metadata is ordered Green, Blue, Red, White Point, Luminance with the values as
G(XG,YG)B(XB,YB)R(XR,YR)WP(XWP,YWP)L(max,min)
all strung together, without spaces. For instance, the ST.2086 metadata for a DCI-P3 display with a maximum luminance of 1000 nits, a minimum of 0.0001 nits, and a D65 white point would be:
G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1)
while a display like the Sony BVM-X300, using BT.2020 primaries, with a white point of D65 and the same max and min brightness would be:
G(8500,39850)B(6550,2300)R(35400,14600)WP(15635,16450)L(10000000,1)
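Both strings can be reproduced with a short Python sketch. The function name and argument order are invented for this example; the chromaticity coordinates are the standard DCI-P3, BT.2020 and D65 values, which correspond to the encoded numbers shown above.

```python
from typing import Tuple

Chromaticity = Tuple[float, float]  # CIE 1931 (x, y)


def st2086_string(green: Chromaticity, blue: Chromaticity, red: Chromaticity,
                  white: Chromaticity, max_nits: float, min_nits: float) -> str:
    """Build the G(..)B(..)R(..)WP(..)L(max,min) string described above."""
    def pair(xy: Chromaticity) -> str:
        # Chromaticities are stored as 100,000 * value / 2.
        return f"({round(xy[0] * 50_000)},{round(xy[1] * 50_000)})"

    # Luminance values are stored as nits * 10,000.
    lum = f"({round(max_nits * 10_000)},{round(min_nits * 10_000)})"
    return f"G{pair(green)}B{pair(blue)}R{pair(red)}WP{pair(white)}L{lum}"


D65 = (0.3127, 0.3290)
DCI_P3 = st2086_string((0.265, 0.690), (0.150, 0.060), (0.680, 0.320),
                       D65, 1000, 0.0001)
BT2020 = st2086_string((0.170, 0.797), (0.131, 0.046), (0.708, 0.292),
                       D65, 1000, 0.0001)

if __name__ == "__main__":
    print(DCI_P3)
    print(BT2020)
```

Keeping the Green, Blue, Red order inside the function means a caller cannot accidentally emit the primaries in the more familiar R-G-B order.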
In an ideal situation, it would be best to use a colorimeter and measure the display’s native R-G-B and white point values; in practice, however, the RGB and white point values of the standard the mastering display conforms to are sufficient to communicate information about the master to the end user’s display.
Example MediaInfo from The Prestige (2006):
Text View
Video
ID: 1
Format: HEVC
Format/Info: High Efficiency Video Coding
Commercial name: HDR10
Format profile: Main [email protected] High
Codec ID: V_MPEGH/ISO/HEVC
Duration: 2 h 10 min
Bit rate: 54.8 Mb/s
Width: 3 840 pixels
Height: 2 160 pixels
Display aspect ratio: 16:9
Frame rate mode: Constant
Frame rate: 23.976 (24000/1001) FPS
Color space: YUV
Chroma subsampling: 4:2:0 (Type 2)
Bit depth: 10 bits
Bits/(Pixel*Frame): 0.276
Stream size: 49.9 GiB
Title: The.Prestige.2006
Writing library: ATEME Titan File 3.8.3 (4.8.3.0)
Default: Yes
Forced: No
Color range: Limited
Color primaries: BT.2020
Transfer characteristics: PQ
Matrix coefficients: BT.2020 non-constant
Mastering display color primaries: Display P3
Mastering display luminance: min: 0.0050 cd/m2, max: 4000 cd/m2
Maximum Content Light Level: 1121 cd/m2
Maximum Frame-Average Light Level: 284 cd/m2
Note: Some videos set both MaxCLL and MaxFALL to 0, relying on the default values of the mastering display.