Here are some of the most popular and widely used [DSP](https://www.ampheo.com/c/dsp-digital-signal-processors) ([Digital Signal Processing](https://www.ampheo.com/c/dsp-digital-signal-processors)) libraries for embedded systems (for background, see [What is a DSP?](https://adrianchad.blogspot.com/2025/01/what-is-dsp-digital-signal-processor.html)):

**1. CMSIS-DSP (Arm Cortex-M series)**
Platform: ARM Cortex-M (e.g., [STM32](https://www.ampheo.com/search/STM32), [NXP](https://www.ampheo.com/manufacturer/nxp-semiconductors), etc.)
Features:
* Fast math functions (FFT, FIR, IIR, convolution)
* Matrix operations
* Filtering, statistical functions
Optimized for: SIMD acceleration via the DSP extension on Cortex-M4, M7, and M33, and via Helium (MVE) on Cortex-M55
Maintained by: Arm
License: Apache 2.0 (permissive)
Best for: STM32 and other Arm Cortex-M based MCUs
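
To make the FIR entry above concrete, here is a minimal, portable C sketch of what a direct-form FIR filter computes. It is not CMSIS-DSP's implementation: the library's own routines (for example `arm_fir_f32`) add block processing and SIMD where the core supports it. The name `fir_filter_f32` and its argument layout are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Direct-form FIR (hypothetical helper, not a CMSIS-DSP function):
 *   y[n] = sum over k of h[k] * x[n - k],  k = 0 .. taps-1
 * Input samples before x[0] are treated as zero. */
static void fir_filter_f32(const float *h, size_t taps,
                           const float *x, float *y, size_t len)
{
    assert(h != NULL && x != NULL && y != NULL && taps > 0);

    for (size_t n = 0; n < len; ++n) {
        float acc = 0.0f;
        /* Stop at k == n so we never read before the start of x. */
        for (size_t k = 0; k < taps && k <= n; ++k) {
            acc += h[k] * x[n - k];
        }
        y[n] = acc;
    }
}
```

With a 4-tap moving average (`h = {0.25, 0.25, 0.25, 0.25}`) and a constant input of 1.0, the output ramps through 0.25, 0.50, 0.75 and then settles at 1.00 once all four taps see input samples.
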

**2. KissFFT**
Platform: Cross-platform (C-based)
Features:
* Lightweight, portable FFT library
* Fixed-point and floating-point support
License: BSD
Use case: FFTs in resource-constrained devices
Best for: Small MCUs with custom FFT needs
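
For reference, the transform that an FFT library computes can be written as a naive O(n²) DFT. The sketch below is plain C and not KissFFT code; `cpx_t`, `dft_table`, and the caller-supplied twiddle table `tw` are hypothetical names used for illustration. An FFT produces the same result in O(n log n) operations, which is why a library is the better choice for anything beyond very small transform sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex sample type for this sketch. */
typedef struct {
    float re;
    float im;
} cpx_t;

/* Naive DFT: X[k] = sum over t of x[t] * W^(k*t), with W = exp(-2*pi*j/n).
 * tw[m] must hold W^m for m = 0 .. n-1, precomputed by the caller, so no
 * trigonometric functions are evaluated inside the loops. */
static void dft_table(const cpx_t *x, cpx_t *X, const cpx_t *tw, size_t n)
{
    assert(x != NULL && X != NULL && tw != NULL && n > 0);

    for (size_t k = 0; k < n; ++k) {
        cpx_t acc = {0.0f, 0.0f};
        for (size_t t = 0; t < n; ++t) {
            /* W^(k*t) repeats with period n, so index the table modulo n. */
            const cpx_t w = tw[(k * t) % n];
            acc.re += x[t].re * w.re - x[t].im * w.im;
            acc.im += x[t].re * w.im + x[t].im * w.re;
        }
        X[k] = acc;
    }
}
```

For n = 4 the twiddle table is {1, -j, -1, j}. A unit impulse at index 0 then transforms to a flat spectrum of ones, and a constant input puts all of its energy in bin 0; both make handy sanity checks when bringing up an FFT on a new target.
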

**3. DSP Library from Texas Instruments (TI DSP Library)**
Platform: [TI](https://www.ampheo.com/manufacturer/texas-instruments) C2000, [MSP430](https://www.ampheo.com/search/MSP430), and other TI DSPs and MCUs
Features:
* Optimized fixed-point and floating-point DSP functions
* FFT, IIR/FIR, filter design
Optimized for: TI’s proprietary cores (e.g., C28x)
Best for: TI DSP chips and real-time control systems
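
Fixed-point DSP functions typically operate on fractional formats such as Q15, where a 16-bit integer represents a value in [-1, 1). The following is a minimal sketch of Q15 conversion and multiplication in portable C. The helper names `float_to_q15` and `q15_mul` are hypothetical, and this is not TI library code. It assumes the compiler implements `>>` on negative values as an arithmetic shift, which is the common case but is implementation-defined in C.

```c
#include <stdint.h>

/* Convert a float to Q15 (hypothetical helper), rounding to nearest and
 * saturating at the limits of the 16-bit format. */
static int16_t float_to_q15(float v)
{
    const float scaled = v * 32768.0f;
    if (scaled >= 32767.0f) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Multiply two Q15 values (hypothetical helper).
 * The 32-bit product is in Q30; add half an LSB to round, shift right by 15
 * to return to Q15, then saturate. Assumes arithmetic right shift. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;  /* Q30 intermediate */
    prod += (int32_t)1 << 14;                /* round to nearest */
    prod >>= 15;                             /* back to Q15 */
    if (prod > INT16_MAX) {
        prod = INT16_MAX;                    /* only (-1) * (-1) gets here */
    }
    return (int16_t)prod;
}
```

For example, 0.5 times 0.5 is `q15_mul(16384, 16384)`, which yields 8192 (0.25 in Q15), while -1 times -1 saturates to 32767 because +1.0 is not representable in Q15.
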

**4. PULP-DSP**
Platform: RISC-V (PULP architecture)
Features:
* DSP kernel functions for energy-efficient processors
* Designed for parallelism and low power
License: Open source (Solderpad/MIT)
Best for: RISC-V based low-power embedded systems

**5. Eigen (Header-only C++ Library)**
Platform: General embedded systems with C++ support
Features:
* Matrix and vector algebra
* Works well with ARM + FPU
Note: Not tuned for MCU-scale devices, but usable on faster processors
Best for: Medium/high-end MCUs with C++ needs

**6. Arm Compute Library**
Platform: ARM Cortex-A (e.g., [Raspberry Pi](https://www.ampheo.com/c/raspberry-pi/raspberry-pi-boards), ARM SoCs)
Features:
* NEON-accelerated [DSP](https://www.ampheoelec.de/c/dsp-digital-signal-processors) and machine learning
* FFT, convolution, pooling
Best for: Embedded Linux + DSP/AI workloads
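
**Comparison Table**

| Library | Platform | Best for |
|---|---|---|
| CMSIS-DSP | Arm Cortex-M | STM32 and other Arm Cortex-M MCUs |
| KissFFT | Cross-platform (C-based) | Small MCUs with custom FFT needs |
| TI DSP Library | TI C2000, MSP430, and other TI devices | TI DSP chips and real-time control systems |
| PULP-DSP | RISC-V (PULP architecture) | RISC-V based low-power embedded systems |
| Eigen | General embedded systems with C++ support | Medium/high-end MCUs with C++ needs |
| Arm Compute Library | Arm Cortex-A | Embedded Linux + DSP/AI workloads |
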
**Conclusion**
If you're using [STM32](https://www.onzuu.com/search/STM32), CMSIS-DSP is the natural first choice: it is maintained by Arm and optimized for Cortex-M cores.
For lightweight or custom FFT needs, consider KissFFT. On TI chips, use the [TI](https://www.onzuu.com/manufacturer/texas-instruments) [DSP](https://www.onzuu.com/category/dsp) Library.
For RISC-V systems, go with PULP-DSP.