# Thread-local data, the ECS, and you

> Hi. What's going on?

Thread-local data, or `!Send` data, is data that cannot be moved to, or dropped on, any thread other than the one it was constructed on. This data is pinned to a thread, and this property virally extends to any type containing it.

> OK, why are we talking about thread-local data?

A normal Bevy application can do something like this.

```rust=
fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        /* configuration */
        .insert_non_send_resource(SomeNonSendType)
        /* more configuration */
        .run();
}
```

In this snippet, a user creates an app, adds some plugins, inserts an instance of thread-local data, and then starts running the app. Because the thread-local data is created upfront, the `World` (and `App`) is immediately pinned to the main thread.

> OK, so?

## Problem(s)

### Getting Accurate, Hi-Res OS Event Timestamps

The main thread is also where the `winit` (or maybe SDL) event loop lives. Most platforms expect this event loop to be the outermost scope on this thread, i.e. they want it to "own" the thread. Thus, the `World` must be updated as part of the [event handler](https://docs.rs/winit/latest/winit/application/trait.ApplicationHandler.html).

But doing this has the unfortunate side effect of blocking the event loop while an update is happening. Blocking the event loop sucks because it prevents us from getting accurate, hi-res timestamps for the events.

> Why does that matter?

Accurate, hi-res event timestamps are essential if an app wants to:

- compensate for the delay between when an event happens and when a frame or tick finally sees it (rhythm games especially)
- deterministically figure out which frame or tick was supposed to see it

> Isn't that winit's problem? Surely the OS has already timestamped the events. Can't winit just start including those timestamps?

In a better world, that'd be true.
But I've tested this myself, and while all platforms of interest do provide timestamps with their events, some don't have the resolution we need. (Windows, [whose message timestamps are aliased to multiples of 16ms](<https://github.com/rust-windowing/winit/issues/1194#issuecomment-1809022665>) *in the best case*, is the biggest example.) And since this is an OS limitation, not a `winit` problem, I'm almost certain that switching to `sdl2` or `sdl3` won't help.

## Solution(s)

So in summary, we can't have accurate event timestamps on all platforms without moving the `World` to a different thread, but the `World` is pinned to the main thread by the **thread-local data created and stored inside it before the event loop starts.**

All problems start there, so a solution would entail:

- Storing the thread-local data in something that isn't the `World`. Then you could at least move the `World` to another thread.

OR

- Creating the thread-local data in some other thread that isn't where the event loop lives. It's fine if the `World` gets pinned to another thread.

OR

- Doing both of those.

### Storing `!Send` Elsewhere (aka "move it out of the `World`")

Strictly speaking, all that's needed to implement this fix is to make the `Resource` trait require `Send`. That change alone would force users to adapt their code.

But we could be a lot nicer about it. As a replacement for the lost functionality, Bevy could provide a thread-local storage type (more friendly to use than raw `thread_local!` variables) as well as a handle for systems to interact with it.

Here's what a relevant system might look like before and after this change.

```rust=
// BEFORE
fn some_system(
    /* ... */
    data: NonSendMut<SomeNonSendType>,
    /* ... */
) {
    /* ... */
}
```

```rust=
// AFTER
fn some_system(
    /* ... */
    worker: Res<TlsWorker>,
    /* ... */
) {
    /* ... */
    worker.run(|tls: &mut Tls| {
        if let Some(data) = tls.get_mut::<SomeNonSendType>() {
            /* ... */
        }
    });
    /* ... */
}
```

See **Appendix A** for example implementation details for `Tls` and `TlsWorker`.

#### Bonus: Reduce the Number of System Params

The dichotomy of `Send` and `!Send` resources is pretty much the only reason Bevy has *six* types of references.

- `Ref<T>`
- `Mut<T>`
- `Res<T>` (`Ref<T>` for `Send` resources)
- `ResMut<T>` (`Mut<T>` for `Send` resources)
- `NonSend<T>` (`Ref<T>` for `!Send` resources)
- `NonSendMut<T>` (`Mut<T>` for `!Send` resources)

With `!Send` resources gone from the `World`, `NonSend<T>` and `NonSendMut<T>` would have nothing left to refer to and could be removed.

#### Bonus: Reduce Scheduler Complexity

If the `World` and the thread-local storage live in different threads *and* systems have an API like this to send work, the systems themselves could be run on any thread. The scheduler would no longer need to branch on thread-local access.

### Creating `!Send` Elsewhere

To ensure thread-local data can only be created away from the event loop, the `App` and `Plugin` API must change. No application code can be allowed to run until the event loop (or whatever the "main context" happens to be) is ready. This means the application lifecycle must be broken up into separate callback functions.

#### Trait-based implementation

```rust=
pub trait App: Sized {
    fn run() {
        run_once::<Self>();
    }

    // user would add plugins, etc. in this callback
    fn setup(world: &mut World);

    fn start(world: &mut World) {
        /* run startup and main loop */
    }
}

pub fn run_once<A: App>() {
    let mut world = World::new();
    A::setup(&mut world);
    A::start(&mut world);
}
```

```rust=
struct MyApp;

impl App for MyApp {
    fn run() {
        bevy_winit::run_app::<MyApp>();
    }

    fn setup(world: &mut World) {
        world.add_plugins(DefaultPlugins);
        /* ... */
    }
}

fn main() {
    MyApp::run();
}
```

See **Appendix B** for a simplified example of what `bevy_winit::run_app` could look like.
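
To make the point of the callback split concrete, here is a minimal std-only sketch of a runner that calls `setup` and `start` on a thread other than the caller's. It is not Bevy's API: `World` is a stand-in that only records where it was built, and `run_on_app_thread` is a hypothetical name. The `'static` bound on `A` is an assumption added so the closure can be handed to `spawn`.

```rust
use std::thread::{self, ThreadId};

/// Stand-in for Bevy's `World` (hypothetical); it only records where it was built.
pub struct World {
    pub created_on: ThreadId,
    pub log: Vec<&'static str>,
}

impl World {
    pub fn new() -> Self {
        Self {
            created_on: thread::current().id(),
            log: Vec::new(),
        }
    }
}

/// Same shape as the trait above, minus the overridable `run`.
pub trait App: Sized {
    fn setup(world: &mut World);

    fn start(world: &mut World) {
        world.log.push("start");
    }
}

/// Hypothetical runner: builds and runs the app on a fresh thread, then hands
/// the finished `World` back so the caller can inspect it.
pub fn run_on_app_thread<A: App + 'static>() -> World {
    thread::Builder::new()
        .name("app".to_string())
        .spawn(|| {
            // Everything the app creates in here is pinned to this thread,
            // not to the thread that called `run_on_app_thread`.
            let mut world = World::new();
            A::setup(&mut world);
            A::start(&mut world);
            world
        })
        .expect("failed to spawn app thread")
        .join()
        .expect("app thread panicked")
}

pub struct MyApp;

impl App for MyApp {
    fn setup(world: &mut World) {
        world.log.push("setup");
    }
}
```

Calling `run_on_app_thread::<MyApp>()` from `main` leaves the main thread free for the event loop, because no user callback ever runs there.
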
#### Struct-based implementation

```rust=
pub struct App {
    pub run: Box<dyn FnOnce(App)>,
    pub setup: Box<dyn FnOnce(&mut World)>,
    pub start: Box<dyn FnOnce(&mut World)>,
}

pub fn run_once(app: App) {
    // the runner has already been taken at this point, so ignore it
    let App { run: _, setup, start } = app;
    let mut world = World::new();
    setup(&mut world);
    start(&mut world);
}
```

```rust=
fn my_setup(world: &mut World) {
    world.add_plugins(DefaultPlugins);
    /* ... */
}

fn main() {
    // `App::new`, `runner`, `setup`, and `run` are assumed to be builder-style
    // helpers (not shown) that fill in and consume the fields above
    let mut app = App::new();
    app.runner(bevy_winit::run_app);
    app.setup(my_setup);
    app.run();
}
```

#### Bonus: Automatic Resolution of Plugin Build Order

As a bonus, delaying the construction of the `World` and any plugins will unblock work on automating plugin build order based on their dependencies.

## Related

### `Component` requires `Send` but...

#### "resources as entities"

The phrase "resources as entities" refers to changing how singleton data is stored: instead of being stored separately, it would be stored as regular components that are each restricted to a single entity.

Well, "resources" currently includes thread-local ones, so we have to entertain the idea of relaxing the requirements of the `Component` trait to support thread-local components too.

```rust
// components cannot be thread-local
pub trait Component: Send + Sync + 'static {}
```

```rust
// components can be thread-local
pub trait Component: 'static {}
```

But if components aren't thread-safe, then `World` isn't thread-safe either, which seems like a major loss.

If we look at the plugin ecosystem, we see many crates using thread-safe types to act as proxies for thread-local ones. To give a few examples, `gilrs` (input) and `kira` (audio) spawn threads to manage their thread-local objects (at least, [when it's possible to do so](https://github.com/WebAudio/web-audio-api/issues/2423)). Bevy itself proxies thread-local window objects from the `winit` backend using window entities.

Encouraging this proxy pattern seems like a better idea than thread-local components.
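
For readers who haven't seen the proxy pattern, here's a minimal std-only sketch of it. This is not how `gilrs`, `kira`, or Bevy implement it; `Device`, `Command`, and `DeviceProxy` are hypothetical names, and the `Rc` is only there to make `Device` `!Send`. The `!Send` value is created on a dedicated thread and never leaves it, while the handle that other threads hold contains nothing but a channel sender.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical thread-local device; the `Rc` makes it `!Send`.
struct Device {
    volume: Rc<Cell<u32>>,
}

/// Commands are plain data, so they can cross threads freely.
enum Command {
    SetVolume(u32),
    GetVolume(Sender<u32>),
}

/// Thread-safe proxy for a `Device` that lives on its own thread.
pub struct DeviceProxy {
    commands: Sender<Command>,
}

impl DeviceProxy {
    /// Spawns the owning thread and returns a handle to it.
    pub fn spawn() -> Self {
        let (commands, inbox) = channel::<Command>();
        thread::spawn(move || {
            // The device is created on, and never leaves, this thread.
            let device = Device { volume: Rc::new(Cell::new(0)) };
            // The loop ends once every `DeviceProxy` sender is dropped.
            for command in inbox {
                match command {
                    Command::SetVolume(volume) => device.volume.set(volume),
                    Command::GetVolume(reply) => {
                        // The requester may have given up; ignore that.
                        let _ = reply.send(device.volume.get());
                    }
                }
            }
        });
        Self { commands }
    }

    pub fn set_volume(&self, volume: u32) {
        self.commands
            .send(Command::SetVolume(volume))
            .expect("device thread is gone");
    }

    pub fn volume(&self) -> u32 {
        let (reply_tx, reply_rx) = channel();
        self.commands
            .send(Command::GetVolume(reply_tx))
            .expect("device thread is gone");
        reply_rx.recv().expect("device thread is gone")
    }
}
```

Because commands are processed in the order they were sent, a `volume()` call observes every earlier `set_volume()` made through the same handle. A handle like this can satisfy the `Send` bound that `Component` and `Resource` require, which the wrapped `Device` never could.
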
#### `web_sys` and `js_sys` types are `!Send`

JavaScript is a single-threaded runtime, so its objects have no inherent thread-safety, and so the types from the crates that bind them are all `!Send` and `!Sync`. This means some types that are `Send` on other platforms become `!Send` on web.

Web has support for shared memory in the form of `SharedArrayBuffer`, but not by default (because of Spectre). The page must be cross-origin isolated (served with specific `Cross-Origin-Opener-Policy` and `Cross-Origin-Embedder-Policy` headers) to use it.

With `SharedArrayBuffer`, you can implement the `TlsWorker` or proxy patterns described earlier. Without `SharedArrayBuffer`, `Send` and `Sync` are meaningless, so there's no harm in relaxing the requirements of the `Component` trait with a feature flag like this.

```rust
#[cfg(any(not(target_family = "wasm"), feature = "shared_array_buffer"))]
mod cfg_marker {
    pub trait CfgSend: Send {}
    impl<T: Send> CfgSend for T {}
    pub trait CfgSync: Sync {}
    impl<T: Sync> CfgSync for T {}
}

#[cfg(all(target_family = "wasm", not(feature = "shared_array_buffer")))]
mod cfg_marker {
    pub trait CfgSend {}
    impl<T> CfgSend for T {}
    pub trait CfgSync {}
    impl<T> CfgSync for T {}
}

pub use cfg_marker::{CfgSend, CfgSync};
```

```rust
// constraints can be relaxed for select target platforms
pub trait Component: CfgSend + CfgSync + 'static { /* ... */ }
pub trait Resource: CfgSend + CfgSync + 'static { /* ... */ }
```

But even without `SharedArrayBuffer`, web workers can still exchange serialized data. Some JS types, like `ArrayBuffer`, are [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects) that can be moved from one worker to another, even though `js_sys` [says they're not `Send`](https://docs.rs/js-sys/latest/js_sys/struct.ArrayBuffer.html#impl-Send-for-ArrayBuffer). (Rust's and the web's semantics on "transfer across thread boundaries" seem to be a bit different.)
So while you can't transfer boxed trait objects from one worker to another without `SharedArrayBuffer`, if you can use serializable commands (or VM bytecode) as a replacement, you can still implement the proxy pattern.

## Appendix A

### Example Implementation of TLS

```rust=
use core::any::TypeId;
use core::cell::RefCell;
use core::marker::PhantomData;
use core::mem;
use core::ptr::NonNull;
use std::collections::HashMap;
use std::panic::{AssertUnwindSafe, catch_unwind, resume_unwind};
use std::sync::mpsc::{Sender, sync_channel};

use bevy_ecs::prelude::Resource;

use marker::*;

mod marker {
    use core::marker::PhantomData;

    /// A type parameter indicating a type that is not [`Send`].
    pub(crate) struct NonSend(PhantomData<*const ()>);

    // SAFETY: This type is a marker meant to remove `Send` only.
    unsafe impl Sync for NonSend {}

    /// A type parameter indicating a type that is not [`Sync`].
    pub(crate) struct NonSync(PhantomData<*const ()>);

    // SAFETY: This type is a marker meant to remove `Sync` only.
    unsafe impl Send for NonSync {}
}

thread_local! {
    static TLS: RefCell<Tls> = RefCell::new(Tls::new());
}

/// A rudimentary thread-local storage.
pub struct Tls {
    data: HashMap<TypeId, TlsItem>,
    _phantom_non_send: PhantomData<NonSend>,
    _phantom_non_sync: PhantomData<NonSync>,
}

impl Tls {
    pub(crate) fn new() -> Self {
        Self {
            data: HashMap::new(),
            _phantom_non_send: PhantomData,
            _phantom_non_sync: PhantomData,
        }
    }

    pub fn insert<T: 'static>(&mut self, value: T) -> Option<T> {
        self.data
            .insert(TypeId::of::<T>(), TlsItem::new(value))
            .map(|item| item.into_inner())
    }

    pub fn get<T: 'static>(&self) -> Option<&T> {
        self.data
            .get(&TypeId::of::<T>())
            .map(|item| item.as_ref())
    }

    pub fn get_mut<T: 'static>(&mut self) -> Option<&mut T> {
        self.data
            .get_mut(&TypeId::of::<T>())
            .map(|item| item.as_mut())
    }

    pub fn remove<T: 'static>(&mut self) -> Option<T> {
        self.data
            .remove(&TypeId::of::<T>())
            .map(|item| item.into_inner())
    }
}

struct TlsItem {
    type_id: TypeId,
    data: NonNull<u8>,
    drop: unsafe fn(NonNull<u8>),
}

impl TlsItem {
    pub fn new<T: 'static>(value: T) -> Self {
        unsafe fn drop_ptr<T>(ptr: NonNull<u8>) {
            // SAFETY: The caller guarantees `ptr` came from `Box::into_raw`
            // on a `Box<T>` and has not been freed yet.
            let _ = unsafe { Box::from_raw(ptr.cast::<T>().as_ptr()) };
        }

        let boxed = Box::new(value);
        // SAFETY: `Box::into_raw` never returns a null pointer.
        let typed = unsafe { NonNull::new_unchecked(Box::into_raw(boxed)) };
        let erased = typed.cast::<u8>();

        Self {
            type_id: TypeId::of::<T>(),
            data: erased,
            drop: drop_ptr::<T>,
        }
    }

    fn type_matches<T: 'static>(&self) -> bool {
        self.type_id == TypeId::of::<T>()
    }

    pub fn into_inner<T: 'static>(self) -> T {
        assert!(self.type_matches::<T>());
        // Skip `Drop for TlsItem`, otherwise the allocation would be freed twice.
        let this = mem::ManuallyDrop::new(self);
        let typed = this.data.cast::<T>();
        // SAFETY: The type was checked above, the pointer came from a `Box<T>`,
        // and `this` is never dropped, so the box is reclaimed exactly once.
        let boxed = unsafe { Box::from_raw(typed.as_ptr()) };
        *boxed
    }

    pub fn as_ref<T: 'static>(&self) -> &T {
        assert!(self.type_matches::<T>());
        let typed = self.data.cast::<T>();
        // SAFETY: The type was checked above and the allocation stays alive
        // for as long as `self` is borrowed.
        unsafe { typed.as_ref() }
    }

    pub fn as_mut<T: 'static>(&mut self) -> &mut T {
        assert!(self.type_matches::<T>());
        let mut typed = self.data.cast::<T>();
        // SAFETY: The type was checked above and `&mut self` guarantees
        // exclusive access to the allocation.
        unsafe { typed.as_mut() }
    }
}

impl Drop for TlsItem {
    fn drop(&mut self) {
        // SAFETY: `data` was created in `new` for the same type that `drop`
        // was instantiated with, and `into_inner` skips this destructor.
        unsafe { (self.drop)(self.data) }
    }
}

/// A handle that, when dropped, drops all data stored inside the [`Tls`].
pub struct TlsDropHandle {
    thread_id: std::thread::ThreadId,
    _phantom_non_send: PhantomData<NonSend>,
    _phantom_non_sync: PhantomData<NonSync>,
}

impl Default for TlsDropHandle {
    fn default() -> Self {
        Self {
            thread_id: std::thread::current().id(),
            _phantom_non_send: PhantomData,
            _phantom_non_sync: PhantomData,
        }
    }
}

impl Drop for TlsDropHandle {
    fn drop(&mut self) {
        // Drop resources normally to avoid the caveats described in
        // https://doc.rust-lang.org/std/thread/struct.LocalKey.html
        TLS.replace(Tls::new());
    }
}

impl TlsDropHandle {
    /// Constructs a new [`TlsDropHandle`].
    pub fn new() -> Self {
        Self::default()
    }
}

/// A type alias for tasks to be run by a worker in the TLS thread.
pub type Task = Box<dyn FnOnce() + Send + 'static>;

/// A handle used to submit work that accesses the "main" [`Tls`].
#[derive(Resource)]
pub struct TlsWorker {
    thread_id: std::thread::ThreadId,
    task_tx: Sender<Task>,
}

impl TlsWorker {
    /// Runs `f` in a scope with exclusive access to the [`Tls`].
    ///
    /// Takes `&self` (sending on the channel needs no exclusive access), so
    /// systems can reach the worker through `Res<TlsWorker>`.
    pub fn run<F, T>(&self, f: F) -> T
    where
        F: FnOnce(&mut Tls) -> T + Send,
        T: Send + 'static,
    {
        if self.is_local_thread() {
            self.run_from_local_thread(f)
        } else {
            self.run_from_foreign_thread(f)
        }
    }

    #[inline]
    fn is_local_thread(&self) -> bool {
        self.thread_id == std::thread::current().id()
    }

    #[inline]
    fn run_from_local_thread<F, T>(&self, f: F) -> T
    where
        F: FnOnce(&mut Tls) -> T + Send,
        T: Send + 'static,
    {
        assert!(self.is_local_thread());
        TLS.with_borrow_mut(|tls| f(tls))
    }

    #[inline]
    fn run_from_foreign_thread<F, T>(&self, f: F) -> T
    where
        F: FnOnce(&mut Tls) -> T + Send,
        T: Send + 'static,
    {
        assert!(!self.is_local_thread());

        // create channel to receive result
        let (result_tx, result_rx) = sync_channel(1);

        let task = move || {
            TLS.with_borrow_mut(|tls| {
                // panic in the caller thread, not the TLS thread
                let result = catch_unwind(AssertUnwindSafe(|| f(tls)));
                result_tx.try_send(result).unwrap();
            });
        };

        let task: Box<dyn FnOnce() + Send> = Box::new(task);
        // SAFETY: This task can only execute once and this function blocks until the task is
        // executed, so all captured references live at least as long as this function call.
        let task: Box<dyn FnOnce() + Send + 'static> = unsafe { mem::transmute(task) };

        // send task to the TLS thread
        self.task_tx
            .send(task)
            .unwrap_or_else(|_| panic!("receiver missing"));

        // block until task completes
        match result_rx.recv().unwrap() {
            Ok(result) => result,
            Err(payload) => resume_unwind(payload),
        }
    }
}
```

## Appendix B

### Example `bevy_winit::run_app`

```rust=
use std::panic::{AssertUnwindSafe, catch_unwind};

/// Creates a `winit` event loop in the current thread and, in a separate thread,
/// creates and starts the app.
pub fn run_app<A: App>() {
    let event_loop = build_platform_event_loop();
    let waker = event_loop.create_proxy();

    let (winit_send, world_recv) = std::sync::mpsc::channel::<WinitEvent>();
    let (world_send, proxy_recv) = std::sync::mpsc::channel::<AppEvent>();
    let (proxy_send, winit_recv) = std::sync::mpsc::channel::<AppEvent>();

    // relays app events to the event loop and wakes it up after each one
    let elp_thread = std::thread::Builder::new()
        .name("winit-event-loop-proxy".to_string())
        .spawn(move || {
            while let Ok(event) = proxy_recv.recv() {
                proxy_send.send(event).unwrap();
                waker.wake_up();
            }
        })
        .unwrap();

    let app_thread = std::thread::Builder::new()
        .name("app".to_string())
        .spawn(move || {
            let result = catch_unwind(AssertUnwindSafe(|| {
                // construct world
                let mut world = World::new();
                world.insert_resource(WinitChannel::new(
                    world_send,
                    world_recv,
                ));

                // setup app
                A::setup(&mut world);

                // start app
                A::start(&mut world);
            }));

            if let Some(panic_payload) = result.err() {
                /* send panic back to main thread */
            }
        })
        .unwrap();

    // run the event loop
    // (this can simply redirect most events over the channel)
    event_loop.run_app(WinitApp::new(winit_send, winit_recv));
}
```
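
The relay thread in `run_app` is the least obvious part of the example, so here is a std-only sketch of just that piece. It is a stand-in, not `winit` code: `Waker` is a hypothetical replacement for the event loop proxy that only counts wake-ups, and `spawn_relay` is an assumed helper name. Unlike the example above, the relay stops quietly instead of panicking if the receiving side has hung up.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{Receiver, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the event loop proxy: it only counts wake-ups.
#[derive(Clone, Default)]
pub struct Waker {
    wake_ups: Arc<AtomicUsize>,
}

impl Waker {
    pub fn wake_up(&self) {
        self.wake_ups.fetch_add(1, Ordering::SeqCst);
    }

    pub fn wake_ups(&self) -> usize {
        self.wake_ups.load(Ordering::SeqCst)
    }
}

/// Forwards every event from `from` to `to`, waking the loop after each one.
/// The thread exits once all senders feeding `from` have been dropped.
pub fn spawn_relay<E: Send + 'static>(
    from: Receiver<E>,
    to: Sender<E>,
    waker: Waker,
) -> JoinHandle<()> {
    thread::Builder::new()
        .name("event-relay".to_string())
        .spawn(move || {
            while let Ok(event) = from.recv() {
                if to.send(event).is_err() {
                    // The event loop side hung up; nothing left to wake.
                    break;
                }
                waker.wake_up();
            }
        })
        .expect("failed to spawn relay thread")
}
```

The relay lets the app thread use a plain channel sender while the event loop, which sleeps inside the OS, still gets nudged whenever something is waiting for it.
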