<style>
.reveal {
  font-size: 2.5em;
}
.reveal pre code {
  font-size: 1.5rem;
}
</style>

#### Rust-Python FFI & multilanguage system debugging!

<div class="avatar margin-bottom--sm"><div class="avatar__intro" itemprop="author" itemscope="" itemtype="https://schema.org/Person"><div class="avatar__name"><a href="https://github.com/haixuantao" target="_blank" rel="noopener noreferrer" itemprop="url"><span itemprop="name">Haixuan Xavier Tao</span></a></div><small class="avatar__subtitle" itemprop="description">Maintainer of dora-rs</small></div></div>

<!-- Put the link to this slide here so people can follow -->

<!-- .slide: data-background="https://hackmd.io/_uploads/HJaiXwgya.jpg" -->

---

## Case study of FFI issues and solutions.

![](https://hackmd.io/_uploads/S1qiK8hRh.png =300x300)

> WebAssembly Interface Types: Interoperate with All the Things!: https://hacks.mozilla.org/2019/08/webassembly-interface-types/

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

## Rust-Python and `pyo3`

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
use pyo3::prelude::*;

#[pyfunction]
fn sum_as_string(a: usize, b: usize) -> PyResult<String> {
    Ok((a + b).to_string())
}

#[pymodule]
fn string_sum(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(sum_as_string, m)?)?;
    Ok(())
}
```

Called with

```bash
maturin develop
python -c "import gosim_ffi; gosim_ffi.sum_as_string(1,1)"
```

---

### Implementation 1: Default

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
#[pyfunction]
fn create_list(a: Vec<&PyAny>) -> PyResult<Vec<&PyAny>> {
    Ok(a)
}
```

:::danger
Calling `create_list` with `input = [1] * 100_000_000` returns in **2.272s**.
:::

---

### Implementation 2: PyBytes

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
#[pyfunction]
fn create_list_bytes<'a>(py: Python<'a>, a: &'a PyBytes) -> PyResult<&'a PyBytes> {
    let s = a.as_bytes();

    // Allocate the output on the Python heap and copy the input into it once
    let output = PyBytes::new_with(py, s.len(), |bytes| {
        bytes.copy_from_slice(s);
        Ok(())
    })?;
    Ok(output)
}
```

:::info
For the same input, `create_list_bytes` returns in **0.078s**. That's about **30x** faster than the default.
:::

---

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

### Implementation 3: [Apache Arrow](https://arrow.apache.org/)

```rust
#[pyfunction]
fn create_list_arrow(py: Python, a: &PyAny) -> PyResult<Py<PyAny>> {
    let arraydata = arrow::array::ArrayData::from_pyarrow(a).unwrap();

    let buffer = arraydata.buffers()[0].as_slice();
    let len = buffer.len();

    // Reference-counted buffer: `to_vec()` copies once, then Arrow shares
    // the allocation without copying it again
    let arc_s = Arc::new(buffer.to_vec());
    let ptr = NonNull::new(arc_s.as_ptr() as *mut _).unwrap();
    let raw_buffer = unsafe { arrow::buffer::Buffer::from_custom_allocation(ptr, len, arc_s) };
    let output = arrow::array::ArrayData::try_new(
        arrow::datatypes::DataType::UInt8,
        len,
        None, // no null bitmap
        0,    // offset
        vec![raw_buffer],
        vec![], // no child data
    )
    .unwrap();

    output.to_pyarrow(py)
}
```

:::success
For the same input, `create_list_arrow` returns in **0.033s**. That's about **2x** faster than PyBytes.
:::

---

## Debugging

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

### `.unwrap()`

:::danger
```
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError('Expected instance of pyarrow.lib.Array, got builtins.int'), traceback: None }', src/lib.rs:45:62
run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/peter/Documents/work/blogpost_ffi/test_script.py", line 79, in <module>
    array = blogpost_ffi.create_list_arrow(1)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError('Expected instance of pyarrow.lib.Array, got builtins.int'), traceback: None }
```
:::

---

### [eyre](https://github.com/eyre-rs/eyre)

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
#[pyfunction]
fn create_list_arrow_eyre(py: Python, a: &PyAny) -> Result<Py<PyAny>> {
    let arraydata = arrow::array::ArrayData::from_pyarrow(a)
        .context("Could not convert arrow data")?;
    // ...
}
```

:::success
```bash
Could not convert arrow data

Caused by:
    TypeError: Expected instance of pyarrow.lib.Array, got builtins.int

Location:
    src/lib.rs:75:50
```
:::

---

## Calling Python from Rust

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
#[pyfunction]
fn call_func_eyre(py: Python, func: Py<PyAny>) -> Result<()> {
    let _call_python = func.call0(py).context("function called failed")?;
    Ok(())
}

/// Turn a Python error into an eyre report that includes the Python traceback
fn traceback(err: pyo3::PyErr) -> eyre::Report {
    let traceback = Python::with_gil(|py| err.traceback(py).and_then(|t| t.format().ok()));
    if let Some(traceback) = traceback {
        eyre::eyre!("{traceback}\n{err}")
    } else {
        eyre::eyre!("{err}")
    }
}

#[pyfunction]
fn call_func_eyre_traceback(py: Python, func: Py<PyAny>) -> Result<()> {
    let _call_python = func
        .call0(py)
        .map_err(traceback) // this gives you the Python traceback
        .context("function called failed")?;
    Ok(())
}
```

---

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

:::danger
```
---Eyre no traceback---
function called failed

Caused by:
    AssertionError: I have no idea what is wrong

Location:
    src/lib.rs:89:39
------
```
:::

:::success
```
---Eyre traceback---
function called failed

Caused by:
    Traceback (most recent call last):
      File "/home/peter/Documents/work/blogpost_ffi/test_script.py", line 96, in abc
        assert False, "I have no idea what is wrong"
    AssertionError: I have no idea what is wrong

Location:
    src/lib.rs:96:9
------
```
:::

---

## Memory growth and memory leak

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
/// Unbounded memory growth
#[pyfunction]
fn unbounded_memory_growth(py: Python) -> Result<()> {
    for _ in 0..10 {
        let a: Vec<u8> = vec![0; 40_000_000];
        let _ = PyBytes::new(py, &a);
        std::thread::sleep(Duration::from_secs(1));
    }
    Ok(())
}
```

:::danger
This will consume 440MB of memory: all ten 40MB `PyBytes` stay alive until the function returns.
:::

---

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
#[pyfunction]
fn bounded_memory_growth(py: Python) -> Result<()> {
    // Release the GIL, then re-acquire it once per iteration
    py.allow_threads(|| {
        for _ in 0..10 {
            Python::with_gil(|py| {
                let a: Vec<u8> = vec![0; 40_000_000];
                let _bytes = PyBytes::new(py, &a);
                std::thread::sleep(Duration::from_secs(1));
            });
        }
    });

    // or

    // Create a fresh pool per iteration; it is dropped at the end of the loop body
    for _ in 0..10 {
        let pool = unsafe { py.new_pool() };
        let py = pool.python();

        let a: Vec<u8> = vec![0; 40_000_000];
        let _bytes = PyBytes::new(py, &a);
        std::thread::sleep(Duration::from_secs(1));
    }

    Ok(())
}
```

:::success
This will consume 80MB of memory: each iteration's `PyBytes` is freed before the next one is allocated.
:::

---

## Race condition: Lock and Python GIL

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
/// Function GIL Lock
#[pyfunction]
fn gil_lock() {
    let start_time = Instant::now();
    std::thread::spawn(move || {
        Python::with_gil(|py| println!("Print after {:#?}", &start_time.elapsed()));
    });

    std::thread::sleep(Duration::from_secs(10));
}
```

:::danger
Printed after 10.0s: the spawned thread has to wait for the GIL, which `gil_lock` holds while it sleeps.
:::

---

```rust
/// No gil lock
#[pyfunction]
fn gil_unlock() {
    let start_time = Instant::now();
    std::thread::spawn(move || {
        std::thread::sleep(Duration::from_secs(10));
    });

    Python::with_gil(|py| println!("1. This was printed after {:#?}", &start_time.elapsed()));

    // or

    let start_time = Instant::now();
    std::thread::spawn(move || {
        Python::with_gil(|py| println!("2. This was printed after {:#?}", &start_time.elapsed()));
    });
    Python::with_gil(|py| {
        py.allow_threads(|| {
            std::thread::sleep(Duration::from_secs(10));
        })
    });
}
```

:::success
"1" was printed after 32µs and "2" was printed after 80µs
:::

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

### TRACING, METRICS, LOGS

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

## [OpenTelemetry](https://opentelemetry.io/)

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

```rust
/// Propagate a trace context from Rust to Python and back
#[pyfunction]
fn global_tracing(py: Python, func: Py<PyAny>) {
    global::set_text_map_propagator(TraceContextPropagator::new());
    let _tracer = opentelemetry_jaeger::new_agent_pipeline()
        .with_endpoint("172.17.0.1:6831")
        .with_service_name("rust_ffi")
        .install_simple()
        .unwrap();

    let tracer = global::tracer("test");

    // Parent Trace, first trace
    let _ = tracer.in_span("parent_python_work", |cx| -> Result<()> {
        std::thread::sleep(Duration::from_secs(1));

        // Serialize the current context into a map that Python can read
        let mut map = HashMap::new();
        global::get_text_map_propagator(|propagator| propagator.inject_context(&cx, &mut map));

        let output = func
            .call1(py, (map,))
            .map_err(traceback)
            .context("function called failed")?;
        let out_map: HashMap<String, String> = output.extract(py).unwrap();
        let out_context = global::get_text_map_propagator(|prop| prop.extract(&out_map));

        std::thread::sleep(Duration::from_secs(1));

        let _span = tracer.start_with_context("after_python_work", &out_context); // third trace

        Ok(())
    });
}
```

---

On the Python side, we can also add a span:

```python
def abc(cx):
    propagator = TraceContextTextMapPropagator()
    context = propagator.extract(carrier=cx)

    with tracing.tracer.start_as_current_span(
        name="Python_span", context=context
    ) as child_span:
        child_span.add_event("in Python!")
        output = {}
        tracing.propagator.inject(output)
        time.sleep(2)
    return output
```

and some `utils.py`

```python
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

propagator = TraceContextTextMapPropagator()
trace.set_tracer_provider(
    TracerProvider(resource=Resource.create({SERVICE_NAME: "python_ffi"}))
)
tracer = trace.get_tracer(__name__)
jaeger_exporter = JaegerExporter(
    agent_host_name="172.17.0.1",
    agent_port=6831,
)
span_processor = BatchSpanProcessor(jaeger_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)
```

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

![](https://hackmd.io/_uploads/r1SQVtnAn.png)

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

---

# [dora-rs](https://github.com/dora-rs/dora)

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

GitHub: https://github.com/dora-rs/dora

Website: https://dora.carsmos.ai/

Discord: https://discord.gg/XqhQaN8P
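
---

### Backup: what travels in the carrier map

<!-- .slide: data-background="https://hackmd.io/_uploads/BktKQDlyp.png" -->

The `map` that `inject_context` fills in Rust and that `abc(cx)` extracts from in Python is, with the W3C trace-context propagators used on both sides, a plain string dictionary with a `traceparent` entry of the form `version-trace_id-span_id-flags`. Here is a minimal stdlib-only sketch of that format. It is not the OpenTelemetry implementation, and `make_traceparent`, `parse_traceparent` and `child_carrier` are hypothetical helper names:

```python
import re
import secrets

# version (2 hex) - trace id (32 hex) - span id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def make_traceparent(trace_id=None, span_id=None, sampled=True):
    """Build a `traceparent` value, generating random ids when none are given."""
    trace_id = trace_id or secrets.token_hex(16)
    span_id = span_id or secrets.token_hex(8)
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Split a `traceparent` value into its four fields."""
    match = TRACEPARENT_RE.match(value)
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, span_id, flags = match.groups()
    return {
        "version": version,
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def child_carrier(carrier):
    """Return a new carrier for a child span: same trace id, fresh span id."""
    parent = parse_traceparent(carrier["traceparent"])
    return {
        "traceparent": make_traceparent(
            trace_id=parent["trace_id"], sampled=parent["sampled"]
        )
    }


if __name__ == "__main__":
    rust_side = {"traceparent": make_traceparent()}  # what Rust would inject
    python_side = child_carrier(rust_side)           # what Python would return
    print(rust_side)
    print(python_side)
```

The child keeps the parent's `trace_id` and gets a fresh `span_id`, which is what lets the Rust and Python spans show up in one trace.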