# 2024 Linux Display Next Hackfest
## May 14th
### Frame timing and VRR
#### Limitations of uAPI
- Some displays have different brightness levels at different refresh rates
- Abrupt changes in refresh rate → abrupt changes in brightness
- Flickers, headaches 😥
- Workarounds to reduce
#### Current user space solutions
- Simple solution: restrict adaptive sync to apps that have a ~fixed update rate
- Doesn't always work out (games fluctuate)
- Some displays blank when toggling VRR
- Other solution: try to smooth refresh rate changes
- No immediate jump, slope adjusting slowly
- Works, mostly… brightness change still visible though
- Scheduling is Hard™, sometimes wake-up is too late for thread
- Low frame rate compensation (LFC): driver doubles refresh rate if it's too low → flicker again
#### Brainstorm better uAPI
- New KMS APIs please?
- Some hardware supports this "smoothing" of the framerate natively (Intel)
- Min/max props? Each commit with target refresh rate/time?
- Kernel in the ~same spot as userspace if it doesn't have native hw support
- KMS API to amend last commit, mailbox semantics?
- API to disable kernel-side LFC so that userspace can do its own stuff
- Some panels don't have EDIDs
- Could expose KMS props for some EDID fields
- But it duplicates sources of truth and codepaths
- Panel drivers can hardcode an EDID
#### Consensus
- We want some min/max VRR API where KMS guarantees a rate in-between (see the sketch below)
- We want some KMS API to disable LFC
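
A rough user-space sketch of what that consensus could look like, assuming hypothetical property names (`VRR_MIN_VFREQ`/`VRR_MAX_VFREQ` stand in for whatever the future uAPI ends up calling them) and the usual libdrm atomic helpers:

```c
/* Hypothetical sketch only: no such properties exist upstream yet.
 * "VRR_MIN_VFREQ"/"VRR_MAX_VFREQ" stand in for the future uAPI names. */
#include <errno.h>
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int set_vrr_range(int fd, uint32_t crtc_id,
                         uint32_t prop_min, uint32_t prop_max,
                         uint32_t min_hz, uint32_t max_hz)
{
	drmModeAtomicReq *req = drmModeAtomicAlloc();
	int ret;

	if (!req)
		return -ENOMEM;

	/* The kernel would then be expected to keep the effective refresh
	 * rate inside [min_hz, max_hz] and smooth transitions itself. */
	drmModeAtomicAddProperty(req, crtc_id, prop_min, min_hz);
	drmModeAtomicAddProperty(req, crtc_id, prop_max, max_hz);

	ret = drmModeAtomicCommit(fd, req, 0, NULL);
	drmModeAtomicFree(req);
	return ret;
}
```

A property to disable kernel-side LFC would presumably be set the same way.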
#### Chat log:
*[12:20 Pekka Paalanen]*
Mario Kleiner's scientific software needs to know if a frame was displayed or discarded.
*[12:23 Dmitry Baryshkov]*
How 'skipping the frame' is going to work with DSI CMD panels or with any other deferred IO use case?
*[12:24 Georges Stavracas]*
Thanks for pasting the question Dmitry. You can try logging out and in again, sometimes Jitsi does this
*[12:29 Jonas Ådahl]*
A database in libdisplay-info perhaps..
*[12:29 Pekka Paalanen]*
Framerate slope limiting is indeed a workaround for "broken" monitor hardware.
*[12:29 Simon Ser]*
Unfortunately it doesn't seem like broken hw is going away anytime soon, even with newer hw
*[12:33 Pekka Paalanen]*
Sure, hardware quirk databases already exist and are here to stay. Now we need one for monitors. Stutter could be caused by any number of reasons. Brightness flickering seems to be unique to broken VRR monitors. For unlisted monitors I'd suggest defaulting to brightness flicker, then it is easy to say how to fix it in settings.
*[12:35 Simon Ser]*
ultimately up to the compositors to decide
*[12:37 Pekka Paalanen]*
We can always recommend best practices.
*[12:39 Dmitry Baryshkov]*
Generating fake EDID is a strange idea.
*[12:39 Christopher Braga]*
Yeah, I am not a fan of that
*[12:40 Michel Dänzer]*
unfortunately, the majority of VRR monitors so far seem to exhibit brightness flicker, even the majority of those with VESA certification (which on paper is supposed to rule out flicker)
*[12:40 Dmitry Baryshkov]*
What about the panels that might have an EDID, but ended up with no DDC pins connected?
*[12:40 Pekka Paalanen]*
EDID is the UAPI, so generating an EDID in the kernel is a good idea. It won't explode the UAPI.
*[12:41 Dmitry Baryshkov]*
We end up having two possible EDIDs in such a case
*[12:42 Uday Kiran]*
For MIPI panels that do not advertise an EDID, how can the kernel driver get to know their VRR range?
*[12:42 Dmitry Baryshkov]*
size is already a part of the connector info.
*[12:42 Pekka Paalanen]*
Dmitry, how two? One that cannot be accessed and a fake one?
*[12:43 Dmitry Baryshkov]*
Pekka, the same panel. In one device it gets DDC pins connected and the userspace gets original EDID. On other device the DDC pins are not connected, so userspace gets another (fake) EDID
*[12:46 Marijn Suijten]*
Having written quite a few panel drivers, downstream MSM kernels hardcode EDID-like parameters in DTS but most of that is converted to a C driver, where such information would have to be passed (such as currently the physical dimensions)
*[12:47 Dmitry Baryshkov]*
Then basically we are to force 'panels should have an EDID' into the drm_panel API
*[12:47 Pekka Paalanen]*
The alternative of exposing everything EDID as KMS UAPI would be massive.
*[12:48 Dmitry Baryshkov]*
Do we need to export everything?
*[12:48 Simon Ser]*
there'll always be "that one more thing"
*[12:48 Pekka Paalanen]*
Not everything, but even just colorimetry is a lot
*[12:54 Pekka Paalanen]*
I think it's related to strobed backlight. Average brightness is the integral over the backlight on and off periods, and on-state power.
*[12:56 Michel Dänzer]*
AFAIK the flicker isn't limited to strobed backlight (I understand VRR monitors with strobed backlight are rare so far)
*[12:57 Pekka Paalanen]*
What's the mechanism causing flicker, if the backlight is continuously on?
*[12:57 Michel Dänzer]*
not sure offhand
*[12:57 Simon Ser]*
btw, https://github.com/swaywm/sway/wiki/VRR-setups, small hwdb
---
### KMS Color/HDR (Kernel)
#### Brief status-update
- AMD:
- [[RFC PATCH v4 00/42] Color Pipeline API w/ VKMS](https://lore.kernel.org/dri-devel/20240226211100.100108-1-harry.wentland@amd.com/)
- NVIDIA:
- Presented a diagram with their plane color pipeline. (link to slides?)
- First step, vendor-specific properties
- Hit some issues
- Intel:
- [[PATCH 00/28] Plane Color Pipeline support for Intel platforms](https://lore.kernel.org/dri-devel/20240213064835.139464-1-uma.shankar@intel.com/)
- Align on the new uAPI, adopted it and implemented it
- Added a few enhancements, some accepted already some not
- No major concerns
- Want more hw caps to be exposed
- Qualcomm: ENOTIME
- Arm: still TODO
#### Current needs
- Consensus, review
- IGT
- Kernel part is well advanced and compositors should start to work on colorop API for validation.
- User-space (gamescope is a good candidate)
- Simon worked on a draft gamescope implementation using the generic API and shared it during the hackfest: https://github.com/ValveSoftware/gamescope/pull/1309 (missing 3D LUT support from the kernel)
- Melissa is hacking `drm_info` to show the new colorop API and will share when she has something useful.
#### Discussions around the proposal with new DRM object type
- AMD: Scaling and YUV
- Scaling happens in different places depending on whether the buffer has a YUV format, how to describe this?
- Perhaps with a CSC for the YUV->RGB part
- A CSC doesn't work for various YUV formats
- Could start with CSC, add new color blocks on demand for formats that can't be handled with a CSC
- Do color pipelines impose restrictions on fb formats/modifiers?
- NVIDIA
- Some LUTs require modesets to change the values (but can be toggled on/off)
- Tearing if updated while flipping
- PQ
- Tone-mapping color op only maps the first channel (which is Intensity in ICtCp)
- LUTs use a complicated variable step size scheme; very involved to expose all the details to user-space
- Enumerated TFs + combined LUT/multiplier in software for now
- LUT can only be enabled if the input buffer isn't FP16
- Reject atomic test? Tag pipeline with new uAPI? (e.g. list of formats, without modifiers)
- Tone mapping LUT only maps a single component
- Intel
- Precision of the pipeline
- LUT elements are uint16 right now
- hw can do better, want to allow user-space to submit uint24
- Type of interpolation
- Distribution of LUT elements in hw
- Practicalities
- Something to keep in mind is that all introduced uAPIs need tests and users in userspace, so an initial proposal needs to limit its scope to make this achievable: an MVP, more or less.
- A place for HW color blocks to be described as reference for compositors.
- GLSL for the blocks would be a nice bonus
- Add examples of common/simple color pipelines with each vendor hw
- We need something like IN_FORMATS for pipelines so that userspace can rule out pipelines without having to test them
- Perhaps flags on ops to tell whether they can be programmed without a modeset
- Flag to mark individual LUT channels as read-only, or a 1DLUT with a single component
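
For orientation, a sketch of how a compositor might walk a plane's pipeline under the proposal above; the `COLOR_PIPELINE`, `TYPE` and `NEXT` property names follow the current RFC and may still change, and `DRM_MODE_OBJECT_ANY` is used only because no colorop object type exists in the uapi headers yet:

```c
/* Sketch following the Color Pipeline API RFC; property names may change. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static uint64_t get_prop(int fd, uint32_t obj_id, uint32_t obj_type,
			 const char *name)
{
	drmModeObjectProperties *props =
		drmModeObjectGetProperties(fd, obj_id, obj_type);
	uint64_t value = 0;

	for (uint32_t i = 0; props && i < props->count_props; i++) {
		drmModePropertyRes *prop = drmModeGetProperty(fd, props->props[i]);

		if (prop && strcmp(prop->name, name) == 0)
			value = props->prop_values[i];
		drmModeFreeProperty(prop);
	}
	drmModeFreeObjectProperties(props);
	return value;
}

static void dump_pipeline(int fd, uint32_t plane_id)
{
	/* Per the RFC, the enum value selects a pipeline and doubles as the
	 * object id of its first colorop (0 = bypass). */
	uint64_t colorop = get_prop(fd, plane_id, DRM_MODE_OBJECT_PLANE,
				    "COLOR_PIPELINE");

	while (colorop) {
		printf("colorop %llu, TYPE enum value %llu\n",
		       (unsigned long long)colorop,
		       (unsigned long long)get_prop(fd, colorop,
						    DRM_MODE_OBJECT_ANY, "TYPE"));
		colorop = get_prop(fd, colorop, DRM_MODE_OBJECT_ANY, "NEXT");
	}
}
```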
#### Chat Log:
*[15:03 Pekka Paalanen]*
Pre-blending is keen on accuracy and full definition of colorops. Post-blending may not be, as long as the operation is not changed. So there is room post-blending for secret hardware sauce. The compositors' desire to switch between GPU and KMS composition does not suffer from post-blending operations.
*[15:04 Simon Ser]*
good point
*[15:10 Uma Shankar]*
its same for Intel as well. YUV is converted to RGB before feeding to plane color blocks
*[15:10 Pekka Paalanen]*
It seems like a generic matrix-curve-matrix sequence can deal with all YUV-to-RGB conversions, including constant luminance and ICtCp, if we can recover full resolution U and V first.
*[15:20 Pekka Paalanen]*
So do we need a pipeline to list all pixel formats it can accept?
*[15:20 Christopher Braga]*
Yeah, I believe we do. At least from the Qualcomm side this is what I would like to do 😃
*[15:21 Pekka Paalanen]*
Works for me (Weston) I think
*[15:21 Victoria Brekenfeld]*
That can be a latency issue
*[15:22 Pekka Paalanen]*
How latency?
*[15:21 Victoria Brekenfeld]*
Testing and uploading blobs takes time. At some point I would have to fallback to shaders to not miss the vblank.
*[15:22 Christopher Braga]*
The capabilities of each color pipeline would be enumerated at boot though. Once you know what pixel formats it is intended for there should be no guess work
*[15:22 Pekka Paalanen]*
Knowing in advance which pixel formats a pipeline supports would lead to that? I think it would reduce the need of testing.
*[15:22 Victoria Brekenfeld]*
Right. Something like requiring IN_FORMATS for the first colorop would be fine.
*[15:23 Pekka Paalanen]*
Sounds good. Modifiers or not?
*[15:23 Victoria Brekenfeld]*
Depends on if modifiers can be an issue? right now it seems like vendors don't care about that, just the format?
*[15:23 Jonas Ådahl]*
I think we need a bit of both: TEST_ONLY on a set of potential pipelines, and a way to filter out what pipelines to not try. some filter will be that they can't be programmed according to compositors need, and having a list of valid formats would help shrink the list to brute force
*[15:34 Simon Ser]*
Right, it's a balance between the two approaches. I don't think we want a declarative uAPI for every weird hw limitation, not exactly IN_FORMATS since no modifiers eheh
*[15:34 Dor Askayo]*
I'm completely clueless on the topic and haven't taken part in previous discussions, but why not do something similar to buffer modifiers where you have what is essentially a number which describes the properties of a color pipeline and then have an implementation for each unique pipeline in user-space in a shared library that all compositors use?
*[15:35 Victoria Brekenfeld]*
sounds hopeful
*[15:38 Jonas Ådahl]*
@Dor: It's hard to describe a pipeline with a number, because there will be as many combinations as there is hardware, and different combinations for the same hardware.
If a new generation of hardware or driver wants to expose a new type of pipeline, it'd be awkward to have to define a "modifier" that describes it, instead of just using the color op list the pipeline defines.
*[15:38 Victoria Brekenfeld]*
Also please no basically forced shared library.
*[15:39 Simon Ser]*
ty for fixing my stuff alex
*[15:40 Pekka Paalanen]*
Dor, such library would need to offer shader snippets in all possible shader languages and variants. It may be a hard sell to compositors. If the pipelines are completely rigid, they would prevent innovation. Innovation is important, because while there are standards for some color conversions and mappings, they are more opinions rather than final solutions. Not everyone even shares the same goals.
*[15:41 Dor Askayo]*
@Jonas, the number of combinations may not really be an issue because you only need to embed aspects that are different between different pipelines. In this talk I only heard of 5-6 possible combinations, really.
@Victoria, you can always maintain an alternative implementation in Rust 😃
*[15:42 Jonas Ådahl]*
@Dor I wouldn't be surprised you can get 5 different effective pipelines from one AMD hw generation alone by enabling/disabling/moving color ops in different ways
*[15:42 Chaitanya Borah]*
https://github.com/ckborah/drm-tip-sandbox/commit/86441c236c9e16a430b629f0a278b444ec1960c8#diff-dab99b769a18136804a5e52ebb50648438500b97f6c42fcdb69fa0668bffff88R3814
*[15:45 Victoria Brekenfeld]*
"minimum viable product"
*[15:46 Dor Askayo]*
@Jonas, I may be wrong, but I'm assuming that you'll need to handle those differences in user space anyway and my idea is just a way to communicate properties as uAPI in a less explicit and more extensible way.
*[15:48 Chaitanya Borah]*
Representing AMD 1D LUTs as single segment 1D LUT:
https://github.com/ckborah/drm-tip-sandbox/commit/1decdcb5d88eb86d46fe75d97c5c51d660976530
*[15:48 Melissa Wen]*
10 min to the end of this session
*[15:49 Dor Askayo]*
The more hardware differences that can be handled in kernel transparently to user-space the better, of course. I'm only suggesting how to communicate properties that user-space must be aware of and handle specifically to get HDR working correctly.
The goal is to avoid the need to brute-force pipelines in TEST_ONLY to no end.
*[15:58 Pekka Paalanen]*
If people want to use my color-and-hdr repo, I welcome more maintainers. 😉
Weston does tear apart ICC profiles, FWIW.
---
### KMS Color/HDR (Use-cases)
#### HDR gainmap images and how we should think of HDR
Presentation: https://docs.google.com/presentation/d/1jLloTJ9IX2eJwE07DQDyOCdrSgCbbPYy141lMJpd5uA
#### Color/HDR testing/CI
- VKMS status-update and future plans
- New/upcoming: YUV support, configfs, VRR, DP-MST emulation
- Harry prototype (faster development than real driver, in a VM)
- Good idea to implement colorop blocks in VKMS for reference
- Lacking reviewers
- Chamelium boards, video capture
- 10 bit support? Probably possible (maybe with software patches)
#### Action points
- We don't need more kernel work around the colorop/generic color API; now it's time to get proposals/use-cases from compositors to validate that the API actually matches their needs and is useful, so maintainers will be more confident to accept it: IGT tests and a userspace use-case.
- WIP branches for testing things out:
- https://gitlab.freedesktop.org/hwentland/linux/-/commits/color-pipeline-wip-1
- https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/commits/color-pipeline-wip-1
- Intel's WIP branch rebased on top of Harry's upstreamed patches (v4)
- https://github.com/ckborah/drm-tip-sandbox/commits/cp-rebased-v4-dev
#### Chat Log
*[16:22 Christopher Cameron]*
https://docs.google.com/presentation/d/1jLloTJ9IX2eJwE07DQDyOCdrSgCbbPYy141lMJpd5uA
*[16:23 Pekka Paalanen]*
Gainmaps in KMS? 😄
*[16:24 Uday Kiran]*
Thanks Melissa
*[16:31 Simon Ser]*
So HDR nothing to do with gamut and color spaces?
*[16:32 Pekka Paalanen]*
Only in technical specifications it does
*[16:33 Victoria Brekenfeld]*
"without sacrificing quality" says something about gamuts, I guess.
*[16:34 Pekka Paalanen]*
Dynamic range and color gamut are technically orthogonal, at least if you ask a camera or a colorimeter. Humans tend to disagree.
*[16:34 Simon Ser]*
So i can just do HDR with sRGB?
*[16:34 Pekka Paalanen]*
Sure
*[16:34 Jonas Ådahl]*
Simon, the easiest way to emulate "HDR" is to increase the brightness of your monitor until it hearts
*[16:35 Simon Ser]*
Until it ♥. I see
*[16:35 Jonas Ådahl]*
Ah, hehe, I wonder how I could "typo" that bad
*[16:35 Pekka Paalanen]*
I loved this presentation!
*[16:46 Simon Ser]*
Yeah, thanks a lot!
*[16:55 Pekka Paalanen]*
Maybe one can though: top layer uses only alpha, and RGB=(0,0,0). Set alpha channel to 1.0 - factor and use some creative pre and post scaling.
*[16:56 Simon Ser]*
Aha. Probably not practical, but fun idea
*[17:04 Pekka Paalanen]*
Why match the black of the video in the black bars, instead of just using the ultimate black of the display for the black bars?
*[17:04 Simon Ser]*
Maybe so that a black video blends in without borders?
*[17:04 Pekka Paalanen]*
Should it blend like that?
*[17:05 Alex Goins]*
But in a pitch black room I can pretend that my OLED TV has a different aspect ratio
*[17:05 Pekka Paalanen]*
that ^
*[17:06 Simon Ser]*
Most of the time people aren't in these conditions though. I suppose it depends on the use-case.
*[17:07 David Turner]*
The background colour property would be useful regardless, our hardware can do it trivially and I assume most others can
*[17:07 Simon Ser]*
Yeah, for sure
*[17:08 Harry Wentland]*
Isn't there a KMS property for a single-color-value plane? Not sure if that got merged. That could be placed at the bottom zpos to support bg color.
*[17:08 Simon Ser]*
Still a proposal, not merged yet.
*[17:09 David Turner]*
That would work great for me and saves a special case. And then your letterbox colour is just a user option.
*[17:09 Melissa Wen]*
We also have a bkg color property proposal not merged because it was lacking userspace needs
*[17:09 Simon Ser]*
https://lore.kernel.org/dri-devel/20230728-solid-fill-v5-0-053dbefa909c@quicinc.com/
*[17:11 David Turner]*
Cool, I'm currently working on a patchset for scanout of letterboxed video, I'll see if I can use that to get some wonderfully tasteless bright blue borders.
*[17:15 Pekka Paalanen]*
I would be ready to take gainmaps into Wayland color-management protocol or its extension, or at least experiment with that.
---
## May 15
### Real time scheduling & async KMS API
- Delayed cursor moves
- Modesets block for a long time in the kernel
- Kernel has restrictions for how long a realtime thread can block
- Can kill compositors 🙀
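
A minimal illustration of the constraint, assuming the budget comes from `RLIMIT_RTTIME` (as set up e.g. by rtkit or the realtime portal; this is an assumption, not stated in the session): once the realtime thread burns more CPU time than the limit without blocking, the kernel sends SIGXCPU and finally SIGKILL.

```c
/* Sketch, assuming the limit comes from RLIMIT_RTTIME (e.g. via rtkit). */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <sys/resource.h>

static void enter_realtime(void)
{
	struct sched_param param = { .sched_priority = 1 };
	struct rlimit rttime = {
		.rlim_cur = 200 * 1000,	/* 200 ms of CPU time, in microseconds */
		.rlim_max = 200 * 1000,
	};

	setrlimit(RLIMIT_RTTIME, &rttime);
	pthread_setschedparam(pthread_self(), SCHED_RR, &param);

	/* A KMS ioctl that busy-works in the driver for longer than this
	 * budget (e.g. a heavyweight connector probe or modeset) can now get
	 * the whole compositor killed. */
}
```

Hence the interest in keeping only the fast commit path on the realtime thread and doing probing/modesets elsewhere.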
#### Issue 1
- https://gitlab.gnome.org/GNOME/mutter/-/issues/3479
- When daisy-chained displays are being connected, GNOME dies
- It crashes because a realtime thread takes too long
- Realtime thread could be:
- querying KMS (connectors, modes)
- flip
#### Issue 2
- pageflip takes too long
- drivers (AMD, NVIDIA) might be doing too much work on a simple pageflip
#### Actions
- Report driver bugs, debug them further
- Document that NONBLOCK is fast
#### Commit deadline
- Cannot predict the last possible time to commit
- Can take a long time to program the hw → maybe split programming and commit?
- Commit needs to happen before vblank (AMD: before vactive)
- There is no feedback to user-space on the deadline so user-space has to guess with an arbitrary extra delay
- Would be nice to have feedback on how much time the driver took to program hw (similar to render job duration)
- Drivers can't tell in advance how much time is necessary, because it depends on which props are modified (example: amdgpu takes a long time to update color management stuff)
#### Desirable features from compositor PoV
- Want to have a timestamp for when programming the last submitted atomic commit is done
- Can implement with hw_done callback in DRM maybe?
- For weirder drivers where the deadline of the programming is much earlier: want a hint
- API to pre-program a color pipeline upfront before committing it
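
What compositors can do today is roughly the following guesswork (not a proposed uAPI): time a nonblocking commit against its flip event and pad the result with an arbitrary margin; the handler would be registered via `drmEventContext.page_flip_handler2` in the event loop.

```c
/* Sketch; assumes one outstanding commit at a time for brevity. */
#include <stdio.h>
#include <time.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static double now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

/* Registered via drmEventContext.page_flip_handler2 and drmHandleEvent(). */
static void on_flip(int fd, unsigned int seq, unsigned int sec,
		    unsigned int usec, unsigned int crtc_id, void *data)
{
	double *submitted = data;

	/* The real hw programming deadline is somewhere before vblank and is
	 * never reported; this only bounds it from above. */
	printf("commit -> flip completion: %.2f ms\n", now_ms() - *submitted);
}

static int commit_timed(int fd, drmModeAtomicReq *req)
{
	static double submitted;

	submitted = now_ms();
	return drmModeAtomicCommit(fd, req,
				   DRM_MODE_ATOMIC_NONBLOCK |
				   DRM_MODE_PAGE_FLIP_EVENT,
				   &submitted);
}
```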
#### Tearing
- Driver checks need to be relaxed: IN_FENCE_FD, FB_DAMAGE_CLIPS, no-op prop changes on non-primary planes
- Some Intel modifiers can't do tearing
- New IN_FORMATS-like prop to list modifiers that can do tearing?
- We don't *need* that new prop: user-space can try buffers with different modifiers until it finds one that works. But that's expensive and doesn't allow compositors to give feedback to direct scanout clients
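
The trial-and-error fallback looks roughly like this; newer kernels additionally advertise `DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP`, so the capability can also be queried up front instead of probing.

```c
/* Sketch: try a tearing flip, fall back to a vsynced one on rejection. */
#include <errno.h>
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int flip_maybe_tearing(int fd, drmModeAtomicReq *req)
{
	uint32_t flags = DRM_MODE_ATOMIC_NONBLOCK | DRM_MODE_PAGE_FLIP_EVENT;
	int ret;

	ret = drmModeAtomicCommit(fd, req, flags | DRM_MODE_PAGE_FLIP_ASYNC, NULL);
	if (ret == -EINVAL) {
		/* Async flip refused (strict driver checks, unsupported
		 * modifier, ...): fall back to a normal vsynced flip. */
		ret = drmModeAtomicCommit(fd, req, flags, NULL);
	}
	return ret;
}
```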
#### Chat log:
*[10:20 Unknown]*
daisy chain bug report in question:
https://gitlab.gnome.org/GNOME/mutter/-/issues/3479
*[10:56 Marijn Suijten]*
Maybe relevant context from Android: https://developer.android.com/ndk/reference/group/choreographer. Specifically AChoreographerFrameCallbackData_getFrameTimelineExpectedPresentationTimeNanos versus AChoreographerFrameCallbackData_getFrameTimelineDeadlineNanos
*[10:56 Pekka Paalanen]*
Is the discussion about shaving sub-frame-period latencies even lower? Maybe underline that.
*[10:57 Jonas Ådahl]*
Xaver and I tried to, but maybe it was not clear enough
*[11:01 Pekka Paalanen]*
We are talking about shaving e.g. 1-2 milliseconds from latency
*[11:01 Michel Dänzer]*
Yes (it can be more than that though)
*[11:02 Pekka Paalanen]*
Not in the order of a full refresh period; it's the order of magnitude that seems confused here
*[11:02 Michel Dänzer]*
E.g. at 60 Hz refresh rate it's relatively easy to shave off ~10 ms
*[11:03 Pekka Paalanen]*
Yes, but maybe explaining about the order of magnitude will clear things up
*[11:13 Michel Dänzer]*
It's nothing to do with transmission over a display link
### Power Savings vs Color/Latency
- Talk: ABM (adaptive backlight management);
- <Link to slides?>
- Similar Intel technology: Intel DPST
- PSR1 latencies (seems to be ~200ms in practice, should be ~5 frames, maybe a bug somewhere?)
- Power optimization vs color accuracy/latency requirements: How do we let drivers optimize for (a significant amount of) power, while being able to conserve color accuracy or latency requirements of the compositor?
- New generic KMS prop to control power savings features?
- Agreement was to make a generic property that compositors could use, but in practice it would only be used by AMD.
- Intel would need to expose a histogram to userspace and let the compositor do this work.
- Generic property would be opt-in by the compositor
- When enabled, the driver would immediately disable any power saving features, and using the sysfs files would return -EBUSY (see the sketch after this list)
- Driver can cache old values
- When switching to a client that doesn't support the feature, restore the previous values
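
Sketch of the compositor side, with an invented property name for illustration (the discussion only agreed on "a generic property", not on its name or where it lives):

```c
/* Hypothetical sketch: property name and placement are placeholders. */
#include <errno.h>
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int require_color_accuracy(int fd, uint32_t connector_id,
				  uint32_t prop_id, uint64_t accuracy_flag)
{
	drmModeAtomicReq *req = drmModeAtomicAlloc();
	int ret;

	if (!req)
		return -ENOMEM;

	/* While this is set, the driver would disable ABM/DPST-style tricks
	 * and the corresponding sysfs knobs would return -EBUSY. */
	drmModeAtomicAddProperty(req, connector_id, prop_id, accuracy_flag);
	ret = drmModeAtomicCommit(fd, req, 0, NULL);
	drmModeAtomicFree(req);
	return ret;
}
```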
#### Atomic check feedback
- Create some common reasons for failure (No bandwidth, no resource)
- With a reference to the property/connector
- Capability groups / "flavour" information
- Indicate which groups of planes are homogeneous / interchangeable (e.g. all overlay planes are identical).
- To cut down on the number of potential plane configurations which the compositor has to try.
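
A sketch of the pruning idea, assuming a hypothetical immutable `CAP_GROUP` plane property as floated in the chat below: once a TEST_ONLY commit fails for one plane in a group, skip every other plane in the same group.

```c
/* Sketch: "CAP_GROUP" is a hypothetical immutable per-plane property. */
#include <stdbool.h>
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

#define MAX_GROUPS 64

static bool group_failed[MAX_GROUPS];

static uint64_t plane_group(int fd, uint32_t plane_id, uint32_t cap_group_prop)
{
	drmModeObjectProperties *props =
		drmModeObjectGetProperties(fd, plane_id, DRM_MODE_OBJECT_PLANE);
	uint64_t group = 0;

	for (uint32_t i = 0; props && i < props->count_props; i++)
		if (props->props[i] == cap_group_prop)
			group = props->prop_values[i];
	drmModeFreeObjectProperties(props);
	return group;
}

static bool worth_trying(uint64_t group)
{
	return group >= MAX_GROUPS || !group_failed[group];
}

static void mark_group_failed(uint64_t group)
{
	if (group < MAX_GROUPS)
		group_failed[group] = true;
}
```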
#### Chat log
*[12:58 Unknown]*
Even the AMD cursor plane?
*[12:58 Harry Wentland]*
haha... cursor is awful
*[12:58 Pekka Paalanen]*
What about "capability group id" immutable number property on KMS planes? Every plane with a number is totally interchangeable with all other planes with the same number. If you test on one, and it fails, you can skip all planes with the same number.
*[13:06 David Turner]*
Missed your message, I think that's exactly what I meant with my "flavour" ID
*[13:13 Pekka Paalanen]*
Yup! DRM flight recorder... but that is a distribution/integrator setup, not compositor.
...maybe KMS clients should be able to record markers in DRM flight recorder stream... or maybe that already works via ftrace stuff, sans privileges?
*[13:15 Simon Ser]*
Pekka, https://gitlab.freedesktop.org/emersion/libliftoff/-/merge_requests/61
Why markers when you can just system("sudo dmesg -C"); XD
Since sudo is required anyways, could just echo something to dmesg
*[13:16 Pekka Paalanen]*
ouch 😃
*[13:16 Victoria Brekenfeld]*
Effectively we would reproduce Android's hardware composer API (HWC2). Please no
### HDR & Color Management (userspace)
- Talk: Linux color handling and management.
- <Link to slides?>
- APIs provided by Wayland and how they can be used to achieve better color management for applications;
#### Color management
- Ref: [color-management protocol status-update]( https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/14)
- Should clients have an explicit "do not tonemap me" flag or should this be indicated by them just setting their color metadata to match that of the output?
- More complicated on desktops with multiple outputs
- ICC is (really, really) hard (but also useful)
- It describes how to convert between the specified color-space and a connection space, which is a really roundabout way to describe the actual color-space
- For creation workflows, apply ICC profile per output
- ICC correction on output can be done without the color protocol, but this doesn't allow clients to skip the compositor color correction if they know about colors.
- Maybe instead of adding ICC support to wayland color protocols, we should just make the client deal with ICC.
- Less ideal for clients who just want to pass through exactly what they want
- But keeps the protocol simpler.
- Non-trivial to achieve a comparable feature set: one cannot just extract data from ICC profiles, and calculating e.g. XYZ throws away information that is part of the ICC profile, which might not produce good enough results.
- Can ICC support be left for e.g. a v2 of the protocol, especially if initially only weston will implement support for it?
- How one does tone mapping affects blending results; something to keep in mind when defining how compositor blending works
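
"Apply ICC profile per output" boils down to something like the following LittleCMS sketch (illustrative, not Weston's actual code); the sRGB source space and `output_icc_path` are assumptions.

```c
/* Illustration: one transform from the content space to the output profile. */
#include <lcms2.h>

static cmsHTRANSFORM make_output_transform(const char *output_icc_path)
{
	cmsHPROFILE src = cmsCreate_sRGBProfile();	/* content space (assumed) */
	cmsHPROFILE dst = cmsOpenProfileFromFile(output_icc_path, "r");
	cmsHTRANSFORM xform = NULL;

	if (dst)
		xform = cmsCreateTransform(src, TYPE_RGB_FLT, dst, TYPE_RGB_FLT,
					   INTENT_RELATIVE_COLORIMETRIC, 0);

	/* Profiles may be closed once the transform exists. */
	cmsCloseProfile(src);
	if (dst)
		cmsCloseProfile(dst);
	return xform;
}

/* Usage: cmsDoTransform(xform, src_px, dst_px, npixels); cmsDeleteTransform(xform); */
```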
#### Color representation
- Ref: [color-representation and video playback](https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/183)
- Has anyone actually implemented it?
- A few skeleton implementations in compositors
- Needs open-coded shader implementations of the YUV conversions to hook up when using OpenGL
- A prototype in gstreamer
- What's missing
- Enum values, instead of ICC code points. So clients with standard video content don't have to know about color in detail.
- White point?
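
As a concrete example of what a single color-representation enum would stand for, BT.709 limited-range Y'CbCr to R'G'B' is just a fixed offset plus a 3×3 matrix (the "matrix" half of the matrix-curve-matrix shape mentioned in the earlier KMS color session):

```c
/* BT.709 limited-range Y'CbCr -> R'G'B'.
 * Inputs are normalized [0,1] code values (8-bit style). */
static void bt709_limited_to_rgb(float y, float cb, float cr, float rgb[3])
{
	/* Expand limited range and center chroma. */
	float yy = (y  - 16.0f / 255.0f) * (255.0f / 219.0f);
	float pb = (cb - 128.0f / 255.0f) * (255.0f / 224.0f);
	float pr = (cr - 128.0f / 255.0f) * (255.0f / 224.0f);

	/* BT.709 matrix (Kr = 0.2126, Kb = 0.0722). */
	rgb[0] = yy + 1.5748f * pr;
	rgb[1] = yy - 0.1873f * pb - 0.4681f * pr;
	rgb[2] = yy + 1.8556f * pb;
}
```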
#### Chat log:
*[14:55 Unknown]*
Any information on LCMS usage w.r.t color profile in weston ?
*[14:55 Pekka Paalanen]*
Weston uses LittleCMS with ICC profiles "as is". The code for parametric image descriptions is to be written, but it's in the near term plans.
*[14:56 Marti Maria (LittleCMS)]*
If you have any question on the color engine, I will be glad to answer (as I'm the author 😃 )
*[14:57 Vikas K]*
The current implementation of LCMS does not support HDR curves; what are the plans to enable that support?
*[14:58 Uday Kiran]*
Nice to meet you Marti Maria. Yes, we do have some questions related to the support of color spaces
*[14:58 Pekka Paalanen]*
My plan is to convert parametric image descriptions into case-by-case cmsPipeline, let LittleCMS arrange the complete pipeline from source and destination image descriptions, and then the result is converted into Weston's internal color pipeline representation.
*[14:58 Marti Maria (LittleCMS)]*
It actually supports them across DToB tags and unbounded mode. It's a matter of using the right profiles and floating point format.
*[15:00 Pekka Paalanen]*
IOW, I will never create a proper ICC profile from a parametric image description, but I create a cmsPipeline with some custom elements added that does what I need, e.g. tone mapping.
*[15:00 Uday Kiran]*
But the cmsPipeline does not support all the colorspaces like DCI-P3 if the monitor supports only till DCI-P3 CS
*[15:00 Pekka Paalanen]*
Uday, it does not need to. It's just math, and LCMS has much of the math operations already.
*[15:01 Marti Maria (LittleCMS)]*
cmsPipeline supports whatever. It's just a chain of math operations
*[15:01 Vikas K]*
Could you please share some reference implementation for using LCMS for HDR use cases.
*[15:03 Marti Maria (LittleCMS)]*
It is not a big deal, just chain two suitable profiles and feed the transform with floating point. If profiles can work in unbounded mode, values over 1.0 will be converted as well.
*[15:04 Pekka Paalanen]*
In Weston we chose to use LittleCMS as the core framework for creating color pipelines because we want to support real ICC profiles, and did not want to reinvent ICC file handling.
Vikas, I can't share anything yet, because it hasn't been written for Weston yet, but it will be.
*[15:07 Vikas K]*
We could use the Weston color pipeline and create a profile for the input surface and do conversions like BT709 to BT2020. I could not find documentation for ST2084, so I assumed it is not supported in LCMS. I will check based on Marti's suggestion for HDR.
*[15:08 Pekka Paalanen]*
Vikas, you mean the PQ curve?
*[15:08 Simon Ser]*
Does LCMS optimize pipelines in some way?
*[15:08 Vikas K]*
Pekka, Its ok even if it is not in Weston reference. if you have something helpful please share.
*[15:08 Simon Ser]*
Or just applies each step one by one?
*[15:08 Marti Maria (LittleCMS)]*
That should work, please contact me if any issue
*[15:08 Pekka Paalanen]*
Vikas, we have not written it yet.
*[15:08 Vikas K]*
ok
*[15:08 Marti Maria (LittleCMS)]*
yes, LCMS optimizes pipelines
*[15:09 Vikas K]*
Pekka, ok np
*[15:11 Pekka Paalanen]*
How much of LCMS optimization is in the GPL-licensed plugin "fast-float"? It did not seem like core LCMS optimized much other than skipping obviously unnecessary elements. Or do you refer to the 16-bit(?) LUT as the optimization?
*[15:11 Marti Maria (LittleCMS)]*
The big difference is the plug-in does the 3D interpolations in floating point. Otherwise other optimizations are already present in the MIT licensed core
*[15:11 Simon Ser]*
What optimizations apart from skipping no-ops, out of curiosity?
*[15:12 Pekka Paalanen]*
Ok. At least in Weston, we do see benefits from our own optimizer, like removing pairs of LUTs that are the inverse of each other.
*[15:12 Marti Maria (LittleCMS)]*
By optimization I mean joining adjacent curves, multiplying matrices or sometimes melting everything into a single CLUT table. The latter is not easy to handle because it is a lossy one.
*[15:13 Simon Ser]*
cool
*[15:14 Marti Maria (LittleCMS)]*
Sometimes I can get rid of everything. Just consider that sRGB to sRGB results in an empty pipeline
*[15:14 Pekka Paalanen]*
Marti, hmm, I don't remember seeing LCMS do that for me.
*[15:15 Vikas K]*
Lot of HDR media files do not provide Brightness information correctly, how do we deal with them to do appropriate tone mapping.
*[15:16 Uday Kiran]*
like min and max luminance
*[15:16 Marti Maria (LittleCMS)]*
Try it 😃 just measure how long takes to convert from sRGB to sRGB.
*[15:17 Pekka Paalanen]*
Vikas, if metadata is wrong, there is nothing we can do about it. Unless you want to pray the end user to adjust the image.
Marti, I have tried. I get a pipeline with curves-matrix-matrix-curves.
*[15:19 Vikas K]*
😃 I agree. How should we deal with it: should we not enable HDR at all, rather than showing content too bright or dull?
*[15:20 Christopher Braga]*
Assumptions will have to be made about sane limits imo. If metadata is saying my content max is 10000 nits for example I wouldn't consider that realistic
*[15:21 Pekka Paalanen]*
You can guess something is not realistic, but you still have no idea what the right value would be.
*[15:21 Alex Goins]*
If it's absolute/PQ you can do histogram analysis per-frame and get some averages, maybe not the desired approach but it's not nothing
*[15:21 Marti Maria (LittleCMS)]*
Double checked it. If you do an sRGB to sRGB you get a nop-like transform. I have checked the testbed on "Same matrix-shaper". Maybe you inhibited optimization by using flags?
*[15:22 Pekka Paalanen]*
Alex, applications could do that, and tell their results to the compositor. I would not make compositors do such analysis, and second-guess application reported metadata.
*[15:23 Alex Goins]*
Yeah just saying that it's technically possible that one could. It's not unheard of, e.g. Sony TVs ignore metadata from the source and use histogram analysis instead, although I wish they didn't
*[15:25 Vikas K]*
Christopher, assuming something is difficult for videos where there are shadows and bright highlights in different frames. Do you think computing brightness in a shader for smaller tiles of the frame is one way of dealing with this?
*[15:26 Pekka Paalanen]*
If a specific ICC profile file is problematic, then is it not problematic regardless of which component in the stack is handling it?
*[15:26 Christopher Braga]*
@Vikas Ah as in producing your own dynamic metadata? Yeah you would definitely be able to do better tone mapping that way, you would just have to be careful with latency.
*[15:28 Vikas K]*
Christopher, yes the latency blows up when i increase my tile size.
*[15:34 Pekka Paalanen]*
ICC profiles are in the protocol so that applications can skip compositor color management on outputs that are set to use ICC profiles. Math is just math, math is not a problem. No, ICC profiles are not parsed every frame. ICC profiles are not even combined every frame. ICC profiles describe transformations between two color spaces, loosely speaking. Complex profiles, simple CMMs. OTOH, parametric image descriptions describe *a* color space. Yes, that's why Weston uses LittleCMS, and does not even try reinvent ICC handling.
*[15:35 Uday Kiran]*
Pekka, will the ICC profiles be created to do the conversion based on the ICC file used by the client?
*[15:35 Pekka Paalanen]*
Uday, I'm not sure what you're asking. An ICC profile describes several different transformations between a device color space and the ICC-specified profile connection space. It is a prescription on how to do that conversion. It is a very flexible and roundabout way to describe a device's color space.
*[15:44 Pekka Paalanen]*
Btw. it's perfectly fine to leave the ICC support as a Weston peculiarity for now. We'll see what comes out of it.
*[15:44 Simon Ser]*
Maybe should consider splitting ICC to a separate protocol
*[15:44 Pekka Paalanen]*
Why?
*[15:44 Simon Ser]*
Or a minor v2. If only weston supports it
*[15:44 Intel Conf Room]*
+1
*[15:44 Uday Kiran]*
+1
*[15:44 Pekka Paalanen]*
ICC support is already a completely orthogonal feature set.
*[15:45 Simon Ser]*
ICC can't be merged with just 1 server impl
*[15:47 Pekka Paalanen]*
Apps can tag their surfaces with ICC profile files, yes. You cannot always (usually?) extract information from ICC profiles. They contain primarily transformations, and extracting something like primaries from a 3D LUT is not a well-defined problem. Sorry, I was typing, so I missed the question.
*[15:47 Marti Maria (LittleCMS)]*
You should not mess with ICC internals. If you want primaries, just create a transform from the profile to XYZ and then feed the colorants with maximum values
*[15:49 Pekka Paalanen]*
Marti, then you assume that the device follows a simple matrix-shaper model anyway.
*[15:49 Marti Maria (LittleCMS)]*
Not necessarily. I assume that an RGB profile would return the XYZ of red if fed 255, 0, 0
*[15:49 Pekka Paalanen]*
The white point of a monitor is what has been calibrated into the monitor.
*[15:50 Marti Maria (LittleCMS)]*
White point can be recovered by using the absolute colorimetric intent and no adaptation
*[15:51 David Turner]*
Slightly stupid question: I just want to overlay/scanout a video plane and have the client mark it as BT.601/709/2020 so the compositor can set the DRM property and have it look vaguely right. Does the proposed protocol do that in a simple way? Is that what the enum values will do?
*[15:51 Pekka Paalanen]*
Marti, but you completely ignore what happens with all intermediate and mixed pixel values.
*[15:52 Marti Maria (LittleCMS)]*
If you mean how the profile works internally, yes. But it's not my business to know how the profile does the trick, I only care about the math results it gives
*[15:52 Pekka Paalanen]*
David, that would be color-representation, if you mean the YUV-RGB conversion matrix.
*[15:52 Jonas Ådahl]*
microphone
*[15:55 Pekka Paalanen]*
Marti, my point is, if one reduces an ICC profile into a set of primaries and white point (and TRC), you will throw away information (non-linear effects and channel cross-talk). Maybe such approximation is sometimes ok, but I assume it is not ok.
Hence, I always use an ICC profile as-is, in full. Just pass values through it and see what comes out.
*[15:55 Marti Maria (LittleCMS)]*
You are completely right
*[15:55 Pekka Paalanen]*
Marti, thanks! 😃
*[15:55 Marti Maria (LittleCMS)]*
👍
*[15:57 David Turner]*
Right, it sounds like the protocol enum values for particular matrix coefficients is exactly what I want (so the compositor doesn't have to map from a set of matrix coefficients to a COLOR_ENCODING enum)
*[15:58 Pekka Paalanen]*
David, yes. We intentionally add enums in protocols, even if they are redundant with a bunch-of-numbers interface, so that the highest-level known information can be passed around.
*[15:58 David Turner]*
👍
*[15:58 Pekka Paalanen]*
I did miss quite a lot of the live discussion while typing in the chat here.
### Strategy for video and gaming use-cases
- Talk: ChromeOS video overlay plane offload - [Slides](https://docs.google.com/presentation/d/1JHp2UDpCa8qwqp1kGIlsrJ_TdG5fdyUY2EG69wBLeqM)
- Chrome/ium using VA-API on Linux flags:
https://chromium.googlesource.com/chromium/src/+/main/docs/gpu/vaapi.md#VaAPI-on-Linux (if flags are out dated or not working ping me)
#### Multiplane support in compositors
- Underlay, overlay, or mixed strategy for video and gaming use-cases;
- Can we improve the KMS Plane UAPI to simplify the plane arrangement problem?
- Is a shared plane arrangement algorithm desired? And if so, how should it be defined? (a shared library like libliftoff, kms uapi docs, or both?) How much of it should be defined vs. left for independent implementations to decide?
- Compositors would like a way to receive failure feedback from KMS to prune the combinatorial explosion
- Defining the failure reasons is non-trivial
- First step is to define a feedback mechanism, start with one generic reason: `-ENOBANDWIDTH`
- Big leap to achieve ideal offloading, small step to enable any offloading
- Take the small step, understand why KMS drivers fail offloading
- Have an unstable API somewhere for drivers to report failure reasons
- DebugFS
#### Action points
- Define a feedback mechanism
- Define a way of saying "not enough memory bandwidth"
- DebugFS interface to give more detailed reasons for debugging and analysis
- Some compositors like a shared allocation logic (i.e. libliftoff), but some wish to have their own logic
- Need to support both
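
The brute-force step that the feedback mechanism is meant to tame looks roughly like this today:

```c
/* Test a candidate plane assignment and fall back to GPU composition when it
 * fails. Every failure currently looks the same to the compositor; a generic
 * "not enough bandwidth" reason would let it prune instead of guessing. */
#include <stdbool.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static bool try_assignment(int fd, drmModeAtomicReq *candidate)
{
	return drmModeAtomicCommit(fd, candidate,
				   DRM_MODE_ATOMIC_TEST_ONLY, NULL) == 0;
}
```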
#### Chat log:
*[16:23 Unknown]*
Will the slides be shared from the last session?
*[16:42 Simon Ser]*
Yeah, please post link to your slides in the shared notes!
Harry, i've tried a gamescope impl of the generic color pipeline API, but missing 3D LUT: https://github.com/ValveSoftware/gamescope/pull/1309
*[16:42 Harry Wentland]*
I have some 3DLUT locally... need to push it to a branch
*[16:43 Simon Ser]*
Ok, i'll just continue assuming it exists then 😃
*[16:43 Harry Wentland]*
It's not fully tested and I have to look more closely at it to see if I have enough confidence to send it 😄
*[17:14 Miguel Casas (Chrome)]*
Chrome/ium using VA-API on Linux flags --
https://chromium.googlesource.com/chromium/src/+/main/docs/gpu/vaapi.md#VaAPI-on-Linux
(if flags are out dated or not working ping me)
*[17:14 Simon Ser]*
These diagrams are sick!
*[18:12 Harry Wentland]*
Thanks for sharing. IIRC the idea was that if there is a standard KMS plane assignment algorithm, then kernel UAPI can be designed and drivers can tune it to work optimally.
*[18:24 Melissa Wen]*
[off-topic] @Simon, I was recently hacking drm_info to show colorop/color pipelines on steamdeck with harry patchset on the kernel, but maybe you already did something like that and have a branch to share?
*[18:25 Simon Ser]*
Oh nice! No i haven't. Would be great help though 😃
*[18:25 Intel Conf Room]*
Chaitanya: I was also thinking of adding color pipeline support on drm_info. @Melissa let me know if I can help 😃
*[18:29 Melissa Wen]*
@Simon, @Chaitanya great to hear it'd be useful. I'll share a hackish branch when I have it working well at least in one use-case
*[18:30 Uma Shankar]*
Nice discussions today, have a great evening. See you tomorrow !!!
---
## May 16
### Display Mux
- Presentation by Mario:
- <Link to slides?>
- [Old Nvidia proposal from 2022](https://lore.kernel.org/dri-devel/Y2qgnxjy%2F%2FWfAnUL@lenny/)
- Not sure if we'll realistically manage to copy complicated DRM state from one card to another
- They have very different capabilities.
- With buffer modifiers, you probably can't even just copy the buffers
- Maybe userspace should be responsible for doing this.
- It could do something much simpler, by moving to a really simple state with a single plane and no color-management (GPU rendering) and then just do a simple flicker-free switch to the new card.
- Why disable HPD?
- Races, phantom hotplug events during switch
- KMS devices are separate state, a KMS atomic commit cannot do cross-device sync
- Might need to be sysfs or IOCTL. TBD what the interface would really look like.
- Try enabling PSR to avoid blanking?
- If the system doesn't have PSR then you can't avoid flicker anyway.
- Can probably assume PSR as this is mostly high-end laptops and the system (GPU1/GPU2/panel) will be designed to work together.
- Can't probe connectors on the new GPU before they are connected
- Can we assume the old connector representing the same monitor will provide identical information?
- Bandwidth / link lanes could be different. Almost certainly fine for internal displays but maybe not for external.
- How to handle failures. You only find out about failures after you have already done most of the switch.
### Display Control
- Bunch of things I wished were there but aren't
#### HDR mode
- 2 InfoFrames (`HDR_OUTPUT_METADATA`, `Colorimetry`)
- Missing a few things
- Exposed in Vulkan: segmented backlight (on/off switch)
- [Intel eDP AUX interface seems to have this](https://patchwork.freedesktop.org/series/132009/) (has to be enabled all the time it seems?)
- Segmented backlighting on LCD displays: you can control the backlight of individual zones, you get more HDR, but artifacts like blooming around bright objects (eg. when moving mouse), bleeding/zones
- Do vendors want to expose things like this?
- On some gens it seems Intel is moving away from this protocol?
- External displays generally control this themselves
- Built in panels might not be able to do this at all
- For LCD really need local dimming for decent HDR (esp mobile: huge power impact)
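
For reference, the part that already exists: entering a PQ signalling mode via the `HDR_OUTPUT_METADATA` connector blob. The luminance numbers below are placeholders; the per-panel knobs discussed above (segmented backlight control, etc.) have no uAPI equivalent yet.

```c
/* struct hdr_output_metadata comes from the kernel's drm_mode.h uapi header
 * (pulled in via xf86drm.h); values here are illustrative placeholders. */
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int add_pq_metadata(int fd, uint32_t connector_id, uint32_t prop_id,
			   drmModeAtomicReq *req, uint32_t *blob_id)
{
	struct hdr_output_metadata meta = {
		.metadata_type = 0,			/* Static Metadata Type 1 */
		.hdmi_metadata_type1 = {
			.eotf = 2,			/* SMPTE ST 2084 (PQ), per CTA-861-G */
			.metadata_type = 0,
			.max_display_mastering_luminance = 1000, /* cd/m2 */
			.min_display_mastering_luminance = 1,	 /* 0.0001 cd/m2 units */
			.max_cll = 1000,
			.max_fall = 400,
		},
	};
	int ret = drmModeCreatePropertyBlob(fd, &meta, sizeof(meta), blob_id);

	if (ret)
		return ret;
	return drmModeAtomicAddProperty(req, connector_id, prop_id, *blob_id) < 0 ? -1 : 0;
}
```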
#### Atomic backlight setting
- [Hans' patches from 2022](https://lore.kernel.org/dri-devel/0d188965-d809-81b5-74ce-7d30c49fee2d@redhat.com/)
- Ideally define specific semantics of backlight (ie. nits)
- Although not all vendors might be able to do this
- Interaction between this and `/sys/class/backlight`.
- Same issue and solution as ABM: As soon as a compositor starts doing atomic backlight, have the sysfs interface return `-EBUSY` or something until the drm client disappears.
- DDC-CI
- Conflicts between KMS and user-space both using it at the same time (`i2c-dev` module)
- How to sync up state (do we care?)
- AUX interface: has locking control
- Users seem to be ddcutil and KDE's backlight daemon
- When we're driving a display with absolute luminance PQ, maybe instead of messing with backlight we should be scaling the content luminance?
- Ideally pre-blending as that gives you more headroom (even though doing it post-blend is easier)
- Kinda weird that you're adjusting a panel-wide property by doing a per-plane operation, but it makes sense.
- Also, maybe the monitor should still be able to have brightness/contrast control?
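
A sketch of the proposed coexistence rule from the sysfs side (the -EBUSY semantics are the proposal discussed above, not merged behaviour):

```c
/* Treat EBUSY as "a KMS client currently owns the backlight atomically". */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int set_sysfs_brightness(const char *dev, int value)
{
	char path[256], buf[16];
	int fd, len, ret = 0;

	snprintf(path, sizeof(path), "/sys/class/backlight/%s/brightness", dev);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	len = snprintf(buf, sizeof(buf), "%d", value);
	if (write(fd, buf, len) < 0)
		ret = (errno == EBUSY) ? -EBUSY : -errno;

	close(fd);
	return ret;
}
```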
#### Source-based tone mapping (no tone mapping in the monitor)
- Without it, we get unpredictable results
- Only time we don't want this might be fullscreen video playback
- Can vendors work with vesa to make this possible?
#### Big DRM/KMS wishlist document
- What do compositors want?
- https://dri.freedesktop.org/docs/drm/gpu/todo.html
- More of a janitorial todo-list than a high-level feature wishlist.
- Does have some backlight things on it
### Content-adaptive scaling and sharpening
- Intel are looking at applying a sharpening filter *post-blending* at the end of the pipeline
- The specific taps are secret sauce, the userspace API would just be a control to switch sharpening on/off and control the degree
- Challenge: being able to recreate the exact result in userspace/EGL
- Intel is post-blending so not a big issue
- AMD is pre-blending per-plane
- Might not be possible to share details of hardware implementations
- OpenGL/Vulkan have scaling filters which are pretty much equivalent across different hardware
- Sharpening is a user preference sort of thing
- Cosmic/kwin would just expose this as a slider in display preferences
- Is scaling defined for KMS? Up to drivers atm
- Each plane/CRTC has `SCALING_FILTER`: Default, NN
- Do we just want to add more options for scaling filters?
- Intel's original patchset had lots more options but they were removed because nothing was going to use them.
- default, medium, bilinear, nn, nn_is_only, edge_enhance
- Are we ok with having more magic in drivers which isn't fully reproducible by userspace?
- Should Wayland have a protocol to define the scaling filter/type for a surface?
- Depends if users notice
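
Today's knob, for comparison: picking a `SCALING_FILTER` enum value on a plane via libdrm. A sharpening on/off-plus-strength property would presumably be set the same way, but it does not exist yet.

```c
/* Select an existing SCALING_FILTER enum value by name on a plane
 * (e.g. "Nearest Neighbor"); returns -ENOENT if not found. */
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int set_scaling_filter(int fd, uint32_t plane_id, const char *name,
			      drmModeAtomicReq *req)
{
	drmModeObjectProperties *props =
		drmModeObjectGetProperties(fd, plane_id, DRM_MODE_OBJECT_PLANE);
	int ret = -ENOENT;

	for (uint32_t i = 0; props && i < props->count_props; i++) {
		drmModePropertyRes *prop = drmModeGetProperty(fd, props->props[i]);

		if (prop && strcmp(prop->name, "SCALING_FILTER") == 0) {
			for (int j = 0; j < prop->count_enums; j++) {
				if (strcmp(prop->enums[j].name, name) != 0)
					continue;
				drmModeAtomicAddProperty(req, plane_id,
							 prop->prop_id,
							 prop->enums[j].value);
				ret = 0;
			}
		}
		drmModeFreeProperty(prop);
	}
	drmModeFreeObjectProperties(props);
	return ret;
}
```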
### Precise timings in VRR mode
amdgpu patches:
https://github.com/kleinerm/linux/commits/vrrexperiments_on5.13rc7
- There is benefit to doing this in the kernel it seems (scanline time precision)
- Could be in the compositor maybe with new KMS props discussed in earlier VRR session?
- Props would be applied for the next frame, which would be an issue
- Locking behavior of driver might get in the way
### Future topics
- VTTY switching and HDR
- Multi-GPU support
- Current dmabuf API has issues - tricky to tell when you can import dmabufs around migration
- Should Wayland define exact semantics of blending (and scaling)?
- Just do something simple so compositors don't have to implement a crazy amount of stuff
- Simple linear vs non-linear blending?
- Browsers are the main users of interesting blending, what do they need?
- Can't really define this fully until tone mapping is figured out.
- wayland-protocol discussion:
- potential protocol for clients to request sRGB-like blending or linear blending (or something close to them)
- Compositor would advertise which it supports and clients could choose one
- Compositors and what they care about and are working on
- Pi (labwc/wlroots): Plane offloading
- Mutter: plane offloading and color management
- Cosmic/smithay:
- Already doing plane offloading so hitting interesting bandwidth issues, want it to be more reliable.
- No plans for full color management pipeline in short-medium term, but basic HDR and color-spaces soon (shader implementation with libplacebo).
- kwin:
- HDR and color-management support is there, clients use a custom protocol.
- Overlay planes soon.
- Chrome:
- Have been offloading video for a while.
- Some parts of stack have had full color management for a while. But video overlay assumes all planes are in the same color-space.
- Working on addressing the flickering when display color primaries don't match content by limiting range to sRGB.
- HDR video entirely shader-based, would like to offload.
- Would like to do plane-based scrolling by moving overlay planes around.
- Weston:
- Have unverified color-management with ICC profiles for outputs and surfaces.
- Have multiplane support and really want HDR10+ support.
- Interested in offloading planes all the way from libcamera to output.
- But who allocates the planes?
- 90 degree rotations?
- Can libcamera use dma-heaps to allocate buffers for software ISP?
- More generic allocator API in KMS from the various heaps
- Using dma-heaps to do offload of software decoded video (for HDR and making it faster)
- Should the compositor or the client do this work?
- Compositor has better information to make this decision