Note:

I need to edit this note when I get the chance, so some things might not be up to date:

  • need to rewrite the parts that include SubView, since I misunderstood what it was for previously and the names here don't match.
  • Also, need to expand on my thoughts regarding "full copy-less", a.k.a. sharing stacked camera results through intermediate textures. Spoiler: I don't think it's even correct in all cases (bugs aside). If it's the only way we can minimize copies, then I think we should support it, but in a constrained way. There are some other architectural issues with it, but I'll elaborate on that later.

Camera Restructure

Currently, Bevy Cameras are kind of a leaky abstraction: multi-camera setups require a lot of manual fiddling with viewports, and post-processing artifacts are common.

I'm proposing to remove Camera from its role as a fundamental rendering primitive and introduce a few new concepts to fill in the gap: Compositors and Views.

Issues with Camera

Sorting, Ordering, Clearing, Oh My!

There's a conflict within Camera that I think is the source of a lot of our issues: cameras are conceptually configured fully individually, with their own ordering index and viewport, rendering effects, even potentially a custom render graph, but we currently try really hard to share intermediate textures between cameras.

As a result, a lot of the compositing logic leaks into the rest of the graph: each camera has to recalculate projection matrices, pass through viewports, and worry about whether to clear the main textures and whether to apply tonemapping.

Manual Viewports :(

Viewports are annoying to set up. Part of my rationale for doing "top down" layout from an external entity is that it opens up the possibility of nicer layout APIs in the future. If each camera stays in charge of choosing its own viewport, I don't see a good way forward there.

UI is Special. Why?

Currently, UI hierarchies render themselves to the entity referenced by their UiTargetCamera component. This Camera is expected to render the UI by calling the UI render graph at the appropriate point in its render graph. This is kind of a weird relationship: I think UI should be the same fundamental "renders to a render target" thing under the hood as cameras, and not require any special handling.

My Proposal

Cameras are no longer the primitive for rendering to a render target. They don't choose their own viewports, either. But, in return, they get full control over their own rendering flow and intermediate textures.

A few new primitives:

Compositor: manages layout and ordering for its child Views, and is where a RenderTarget is actually selected.

View: a child of a Compositor, which it requests a viewport from. Currently this is stupid simple, and keeps the same Camera-SubView API we have now. Each one has the responsibility of rendering to its assigned viewport once per frame, but the exact method is for each view to decide. Views can be cameras, UI canvases, or anything that needs a render graph and writes to a final output texture.

Some pros:

  • using immutable components like RenderTarget lets us respond to most changes in realtime, and simplifies camera_system
  • the "top down" nature of compositors means we get ordering for free through the CompositedViews relationship
  • Since cameras manage their own intermediate textures:
    • we can remove (almost) all the atomics from ViewTarget (now ViewTextures)
    • gives upscalers like DLSS an easier time, since all views are "fullscreen"
    • opens the door for better customizability (custom formats, low-res rendering)
    • no explicit viewport handling in the majority of render graph nodes
  • Since the existence of a ViewTarget implies a properly set up compositor and render target, all of its methods can be infallible! (physical_target_size, logical_viewport_rect, etc). Contrast this with the same methods on Camera, which all currently return Option<...>.
  • unified rendering primitives: UI views don't need any special handling from the cameras they composite onto. They can just be handled like any other view, blended over the top of the game by the compositor
  • for the crate reorg:
    • better separating the concerns around Camera should make it a lot nicer to abstract into its own crate
    • bevy_ui can depend on bevy_render rather than bevy_camera
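The infallible-methods point above can be made concrete with a small standalone sketch. These are hypothetical, simplified types (not the real engine definitions): because a `ViewTarget` only exists once the compositor has assigned it a target and viewport, its accessors never need to return `Option`.

```rust
// Illustrative sketch only: simplified stand-ins for the real types.
#[derive(Clone, Copy, PartialEq, Debug)]
struct UVec2 { x: u32, y: u32 }

struct Viewport { offset: UVec2, size: UVec2 }

struct RenderTargetInfo { physical_size: UVec2 }

// A `ViewTarget` is only inserted by the compositor once layout is
// resolved, so both fields are guaranteed to be meaningful.
struct ViewTarget {
    info: RenderTargetInfo,
    viewport: Option<Viewport>, // None = the full target
}

impl ViewTarget {
    // Infallible: the target info always exists.
    fn physical_target_size(&self) -> UVec2 {
        self.info.physical_size
    }

    // Infallible: falls back to the full target when no viewport is set.
    fn physical_viewport_size(&self) -> UVec2 {
        self.viewport
            .as_ref()
            .map(|v| v.size)
            .unwrap_or(self.info.physical_size)
    }
}

fn main() {
    let vt = ViewTarget {
        info: RenderTargetInfo { physical_size: UVec2 { x: 1920, y: 1080 } },
        viewport: None,
    };
    // No Option-unwrapping needed at the call site, unlike Camera today.
    assert_eq!(vt.physical_viewport_size(), UVec2 { x: 1920, y: 1080 });
}
```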

Some cons:

  • have to remember to order UI above its respective camera. We could probably detect this and warn.
  • harder to do copy-less. See below.
       RenderTargetChanged
       ┌────────────┐
       ▼            │
┌────┬────────────┬─┴────────────┬───────────────────┐
│ e1 │ Compositor │ RenderTarget │ RenderGraphDriver │
└────┴──────────┬─┴──────────────┴───────────────────┘
       ▲      ▲ │
       │      │ │
  ViewChanged │ └───────┬─┐
       │      │         │ │
       │ SubViewChanged │ │
       │      │         │ │
       │      │         ▼ │
┌────┬─┴────┬─┴───────┬────────────┬────────┬───────────────────┐
│ e2 │ View │ SubView │ ViewTarget │ Camera │ RenderGraphDriver │
└────┴──────┴─────────┴────────────┴────────┴───────────────────┘
       │      │           │
       │      │           ▼
┌────┬─┴────┬─┴───────┬────────────┬──────────┬───────────────────┐
│ e3 │ View │ SubView │ ViewTarget │ UiCanvas │ RenderGraphDriver │
└────┴──────┴─────────┴────────────┴──────────┴───────────────────┘

Note: see below. `SubView` and `ViewTarget` are slightly mismatched from their current definitions in the engine. Need to edit/clarify this better later

What does it look like in the render world?

  1. extract compositors and views
  2. run compositor render graphs
  • default compositor render graph calls each of its child views in order, then presents the result
  • compositor either maintains its own intermediate texture or has views write directly to the swapchain, depending on what's faster for each platform
  3. views create and manage their own intermediate textures (or not) and are always conceptually "fullscreen" until they blit to the compositor. See the copy-less section below; I don't think this is worth compromising in the general case to eliminate ~1 copy per camera.
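The flow above can be sketched in plain Rust. All names here are hypothetical stand-ins for the real render-graph plumbing: each child view renders "fullscreen" into its own textures, blits into its assigned viewport, and the compositor presents at the end.

```rust
// Hypothetical sketch of the default compositor graph's ordering.
struct ViewNode {
    name: &'static str,
}

impl ViewNode {
    fn run(&self, commands: &mut Vec<String>) {
        // 1. the view renders into its own intermediate textures,
        //    conceptually always "fullscreen"
        commands.push(format!("render {}", self.name));
        // 2. the compositor blits that result into the view's viewport
        commands.push(format!("blit {} -> viewport", self.name));
    }
}

// Runs each child view in order, then presents the composed result.
fn run_default_compositor_graph(views: &[ViewNode]) -> Vec<String> {
    let mut commands = Vec::new();
    for view in views {
        view.run(&mut commands);
    }
    commands.push("present".to_string());
    commands
}

fn main() {
    let cmds = run_default_compositor_graph(&[
        ViewNode { name: "camera_3d" },
        ViewNode { name: "ui" },
    ]);
    // UI is ordered after the camera, so it composites on top.
    assert_eq!(cmds[0], "render camera_3d");
    assert_eq!(cmds.last().unwrap(), "present");
}
```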

MSAA writeback

The compositor gives each view a TextureView and Viewport for the view target. We can just read back from this at the start of the graph!

Actually compositing

Each camera gets to decide on its own how to write to the view target, but we can provide a "standard" node to blit/copy from the main intermediate texture to the ViewTarget.
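A "standard" composite node would also need to decide between a raw copy and a sampling blit. This is a hypothetical decision sketch (simplified formats, not real wgpu types): a plain texture copy only works when formats match and the source fills the destination region exactly; anything else needs a fullscreen draw that samples the source.

```rust
// Illustrative decision logic for a hypothetical "standard" composite node.
#[derive(Clone, Copy, PartialEq)]
enum Format { Rgba8, Rgba16Float }

#[derive(Clone, Copy, PartialEq)]
struct TextureDesc {
    format: Format,
    size: (u32, u32),
}

#[derive(Debug, PartialEq)]
enum CompositeOp { Copy, Blit }

fn choose_op(src: TextureDesc, dst: TextureDesc, dst_viewport: Option<(u32, u32)>) -> CompositeOp {
    // A viewport that differs from the source size, or any format
    // mismatch, forces a sampling blit instead of a raw copy.
    let sizes_match = dst_viewport.map_or(src.size == dst.size, |vp| vp == src.size);
    if src.format == dst.format && sizes_match {
        CompositeOp::Copy
    } else {
        CompositeOp::Blit
    }
}

fn main() {
    let a = TextureDesc { format: Format::Rgba8, size: (64, 64) };
    let b = TextureDesc { format: Format::Rgba16Float, size: (64, 64) };
    assert_eq!(choose_op(a, a, None), CompositeOp::Copy);
    // HDR intermediate -> LDR target: must blit.
    assert_eq!(choose_op(b, a, None), CompositeOp::Blit);
    // Source larger than the assigned viewport: must blit (downscale).
    assert_eq!(choose_op(a, a, Some((32, 32))), CompositeOp::Blit);
}
```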

What would the new use of Camera be?

If we remove all the bits of camera that actually drive rendering, what's the use of keeping Camera around?

Well, it's still useful for all the stuff that needs the physical metaphor of a camera! Projection transforms, the concept of space (#[require(Transform)]), and all the machinery for visibility checking, etc. We can also get more opinionated about the "standard" feel of what a camera does in the future, since it wouldn't be as fundamental to rendering. For example, once we have render-graphs-as-schedules, I'd like to provide some common system sets so third-party crates can integrate with more than the built-in render graph.

I still want to keep Camera around and have it stay important, but if we want to keep cleaning up the renderer and solving compositing bugs, I think its scope needs to be smaller than it is now.
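To illustrate what would remain: a slimmed-down Camera keeps the "physical camera" data (projection, transform, visibility settings) while target and viewport selection move out. The types below are illustrative only, with a toy orthographic projection standing in for the real projection machinery.

```rust
// Illustrative sketch: what stays on Camera after the restructure.
struct OrthographicProjection {
    width: f32,
    height: f32,
}

impl OrthographicProjection {
    // Map view-space (x, y) into normalized device coordinates [-1, 1].
    fn project(&self, x: f32, y: f32) -> (f32, f32) {
        (2.0 * x / self.width, 2.0 * y / self.height)
    }
}

struct Camera {
    projection: OrthographicProjection,
    // Transform and visibility/culling settings would also live here;
    // render-target and viewport selection move to the compositor/view.
}

fn main() {
    let cam = Camera {
        projection: OrthographicProjection { width: 800.0, height: 600.0 },
    };
    // The top-right corner of the view maps to NDC (1, 1).
    assert_eq!(cam.projection.project(400.0, 300.0), (1.0, 1.0));
}
```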

Compositors

#[derive(Component, Default, MapEntities)]
#[require(
    RenderTarget,
    CompositedViews,
    RenderGraphDriver::new(DefaultCompositorGraph),
    SyncToRenderWorld
)]
pub struct Compositor {
    // This is the actual list of active views, and is a subset of
    // the `CompositedViews` component. it also dictates the order
    // in which the views are rendered.
    #[entities]
    views: Vec<Entity>,
    target: Option<Arc<(NormalizedRenderTarget, RenderTargetInfo)>>,
}

// now an immutable component on compositors!!!
#[derive(Component, Debug, Clone, Reflect, From)]
#[component(
    immutable, 
    on_insert = Self::on_insert, // trigger `CompositorEvent::RenderTargetChanged`
    on_remove = Self::on_remove  // reinsert default if on a compositor
)]
#[reflect(Clone)]
pub enum RenderTarget {
    Window(WindowRef),
    Image(ImageRenderTarget),
    TextureView(ManualTextureViewHandle),
}

#[derive(Component, Default)]
#[relationship_target(relationship = CompositedBy)]
pub struct CompositedViews(Vec<Entity>);
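The split between `CompositedViews` (every attached view) and `Compositor::views` (the active, ordered subset that actually renders) can be modeled in plain Rust. This is a hypothetical sketch of the bookkeeping, not the real ECS plumbing: disabled views stay in the relationship but are skipped when the active list is rebuilt.

```rust
// Hypothetical plain-Rust model of the compositor's active-view list.
type Entity = u64;

#[derive(Clone, Copy, PartialEq)]
enum View { Disabled, Enabled }

struct Compositor {
    // Active views, in render order (earlier = drawn first).
    views: Vec<Entity>,
}

impl Compositor {
    // Rebuild the active list (e.g. in response to a `ViewChanged`
    // event), keeping only enabled members of the relationship,
    // in their relationship order.
    fn refresh(&mut self, composited_views: &[(Entity, View)]) {
        self.views = composited_views
            .iter()
            .filter(|(_, state)| *state == View::Enabled)
            .map(|(e, _)| *e)
            .collect();
    }
}

fn main() {
    let mut c = Compositor { views: Vec::new() };
    c.refresh(&[(1, View::Enabled), (2, View::Disabled), (3, View::Enabled)]);
    // Entity 2 stays related but is not rendered.
    assert_eq!(c.views, vec![1, 3]);
}
```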

Views

#[derive(Component, Default)]
#[component(
    immutable, 
    on_insert = Self::on_insert, // trigger `CompositorEvent::ViewChanged`
    on_remove = Self::on_remove  // same as above
)]
#[require(RenderGraphDriver, SyncToRenderWorld)]
pub enum View {
    Disabled,
    #[default]
    Enabled,
}

// Sorry for the overloaded name! I've renamed the other `ViewTarget`
// to `ViewTextures`. All the viewport- and target-related methods
// previously on `Camera` are on `ViewTarget` now, and are no longer 
// fallible.
//
// This is completely managed by the parent `Compositor` in response
// to layout events.
#[derive(Component, Clone)]
pub struct ViewTarget {
    target: Arc<(NormalizedRenderTarget, RenderTargetInfo)>,
    viewport: Option<Viewport>,
}

// Same as we have now, just as an immutable component
// (and renamed from `CameraSubView`). A full fancy layout api
// can come later, and needs some design first :p
#[derive(Debug, Component, Clone, Copy, Default, Reflect, PartialEq)]
#[component(
    immutable, 
    on_insert = Self::on_insert, // trigger `CompositorEvent::SubViewChanged`
    on_remove = Self::on_remove  // reinsert default if on a view
)]
#[reflect(Clone, PartialEq, Default)]
pub struct SubView {
    pub full_size: UVec2,
    pub offset: Vec2,
    pub size: UVec2,
}

#[derive(Component)]
#[relationship(relationship_target = CompositedViews)]
pub struct CompositedBy(pub Entity);

// the newly-renamed version of `CameraSubGraph`, 
// which is used for both Views *and* Compositors
#[derive(Component)]
pub struct RenderGraphDriver(InternedRenderSubGraph);
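Resolving a `SubView` into a physical viewport is the one piece of layout math the compositor needs today. The function below is a hypothetical sketch of that resolution, mirroring how the existing Camera sub-view scaling works: the sub-view rectangle is expressed relative to `full_size` and scaled to the target's actual physical size.

```rust
// Hypothetical sketch of SubView -> physical Viewport resolution.
#[derive(Debug, PartialEq)]
struct Viewport {
    offset: (u32, u32),
    size: (u32, u32),
}

struct SubView {
    full_size: (u32, u32), // logical space the sub-view is defined in
    offset: (f32, f32),    // top-left corner, in logical units
    size: (u32, u32),      // extent, in logical units
}

fn resolve_viewport(sub: &SubView, target_size: (u32, u32)) -> Viewport {
    // Scale factors from logical space to the physical target.
    let sx = target_size.0 as f32 / sub.full_size.0 as f32;
    let sy = target_size.1 as f32 / sub.full_size.1 as f32;
    Viewport {
        offset: ((sub.offset.0 * sx) as u32, (sub.offset.1 * sy) as u32),
        size: ((sub.size.0 as f32 * sx) as u32, (sub.size.1 as f32 * sy) as u32),
    }
}

fn main() {
    // A sub-view covering the right half of a 100x100 logical space,
    // resolved against a 200x200 physical target.
    let sub = SubView { full_size: (100, 100), offset: (50.0, 0.0), size: (50, 100) };
    let vp = resolve_viewport(&sub, (200, 200));
    assert_eq!(vp, Viewport { offset: (100, 0), size: (100, 200) });
}
```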

Future Work/Questions

Compositing isn't simple

There's a lot of more esoteric ways to composite rendering effects, for example with world-space render targets (think portals or mirrors), stencil-based compositing, or interleaving 2d and 3d elements in a transform hierarchy. While I think the model proposed here is a dramatic improvement over what we have currently, we should consider the direction we want to take moving forward.

Copy-less compositing

Texture copies can be expensive, so we want to minimize them where possible. It should be possible to eliminate them entirely in constrained circumstances (no per-camera post-processing, all cameras sharing the same settings). Currently, the way we do this is by sharing camera results through ordering and intermediate textures, which is incredibly bug-prone, and I don't believe that's the way we should move forward (at least in the general case).

RTaE?

Bevy Windows are already entities, so I don't think it's that much of a stretch to imagine modeling all render targets as entities, and using 1:1 relationships to prevent conflicts. We could also merge RenderTarget/NormalizedRenderTarget once we have Construct.


Picking Integration

Picking is pretty crucial to get right, but I don't really have any experience with it. If there's a part of this that might compromise bevy_picking, or that could be made to work better with it, please let me know :)