# High-performance modmesh

###### tags: `modmesh`

We want to make modmesh run very fast. Steps to be taken:

1. Add in-house runtime profiling code, starting with a scope-based profiler.
2. Design and implement cache-friendly constructs.
    * SimpleArray is already cache-friendly. We need to profile to find the runtime hotspots and confirm that the cache is being used effectively.
3. On top of the cache-friendly code, add SIMD (data parallelism).
    * x86, Neon, Apple Silicon.
4. On top of the SIMD-enabled code, add stream-processing (GPU) code (only for data parallelism).
    * Need controls for Apple Silicon, Intel, Nvidia, and AMD.
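
To make step 1 more concrete, here is a minimal sketch of what a scope-based profiler could look like. It is written in Python for brevity and is not modmesh's actual API: `ScopedProfiler`, `scope`, `report`, and the `fill` workload are all hypothetical names chosen for illustration. The idea is that entering a scope starts a timer and leaving it (normally or through an exception) adds the elapsed time to a per-name record; the same pattern maps to an RAII guard object in C++.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class ScopedProfiler:
    """Hypothetical profiler: accumulates call count and wall time per scope name."""

    def __init__(self):
        # name -> [call_count, total_seconds]
        self._records = defaultdict(lambda: [0, 0.0])

    @contextmanager
    def scope(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Runs when the scope exits, even if the body raised.
            record = self._records[name]
            record[0] += 1
            record[1] += time.perf_counter() - start

    def report(self):
        # Return a plain dict snapshot: name -> (call_count, total_seconds).
        return {name: (rec[0], rec[1]) for name, rec in self._records.items()}


profiler = ScopedProfiler()


def fill(n):
    # Stand-in workload; a real hotspot would be an array kernel.
    with profiler.scope("fill"):
        return [i * 0.5 for i in range(n)]


for _ in range(3):
    fill(10_000)

for name, (count, total) in profiler.report().items():
    print(f"{name}: {count} calls, {total:.6f} s total")
```

Sorting the report by total time gives a first list of hotspot candidates, which is the input the cache-friendliness check in step 2 needs.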