[FPGAs](https://www.ampheo.com/c/fpgas-field-programmable-gate-array) feel “low memory” because the memory you get on the FPGA chip itself is optimized for fast, parallel hardware work, not for storing lots of data like a CPU/[SoC](https://www.ampheo.com/c/system-on-chip-soc).

![Screenshot 2024-11-06 at 11](https://hackmd.io/_uploads/HJcXF9LX-e.png)

Here’s what drives that.

**1) Silicon area: SRAM is expensive, and FPGA logic is already “overhead-heavy”**

* On-chip FPGA memory is almost always SRAM blocks (BRAM/URAM).
* SRAM takes a lot of die area, and an FPGA already spends huge area on programmable routing + configuration bits (the “reconfigurable tax”).

So vendors can’t pack smartphone-style memory sizes without the chip becoming enormous and expensive.

**2) FPGA memory is built for bandwidth, not capacity**

BRAM/URAM is designed to give you:

* Many independent ports (often dual-port)
* Lots of parallel accesses
* Predictable single-digit-cycle latency

That’s perfect for FIFOs, line buffers, [filter](https://www.onzuu.com/category/filters) taps, caches, packet buffers… but it isn’t meant to hold gigabytes.

**3) More on-chip RAM would also hurt timing, power, and yield**

* Bigger dies → worse manufacturing yield → higher cost
* More SRAM arrays → higher leakage and dynamic power
* Longer wires across the chip → harder timing closure

So vendors choose a balanced amount of BRAM so the device stays usable and affordable.

**4) Most FPGA applications stream data, so they don’t need huge on-chip storage**

A lot of FPGA workloads are:

* video processing pipelines
* networking packet processing
* [ADC](https://www.onzuu.com/category/analog-to-digital-converters)/[DAC](https://www.onzuu.com/category/digital-to-analog-converters) [DSP](https://www.ampheo.com/c/dsp-digital-signal-processors) chains
* motor control / industrial I/O

These are streaming workloads: data comes in, is processed in a pipeline, and goes out. You typically need buffers, not massive storage.
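The streaming point can be made concrete with a toy model: a 3-line vertical blur over a video stream only ever needs two line buffers in flight, never the whole frame. A minimal Python sketch (the line width, tiny frame, and averaging kernel are made up for illustration; on an FPGA each buffered line would typically sit in a BRAM):

```python
from collections import deque

def vertical_blur_stream(lines):
    """Average each pixel with the pixels directly above and below.

    Only two previous lines are buffered at any time -- the on-chip
    cost is 2 lines of pixels, no matter how tall the frame is.
    """
    buf = deque(maxlen=2)  # models two BRAM line buffers
    for line in lines:
        if len(buf) == 2:
            above, middle = buf
            yield [(a + m + b) // 3 for a, m, b in zip(above, middle, line)]
        buf.append(line)  # oldest line falls out automatically

# A 4-line "frame" streamed in one line at a time:
frame = [[0, 0, 0], [3, 3, 3], [6, 6, 6], [9, 9, 9]]
out = list(vertical_blur_stream(frame))
print(out)  # [[3, 3, 3], [6, 6, 6]]
```

The same pattern scales to full HD video: storage stays at two lines of pixels while the frame itself streams straight through, which is exactly the shape BRAM is sized for.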
**5) “Big memory” is usually external (DDR/HBM/QSPI), not inside the FPGA fabric**

FPGAs scale memory by attaching:

* DDR3/DDR4/DDR5 (common on boards)
* HBM on high-end FPGAs (very high bandwidth, still not “free”)
* QSPI flash for configuration / logs (slow, cheap)

So the chip’s internal BRAM is like an L1/L2 scratchpad, and external DDR/HBM is the “real RAM.”

**6) You can build larger memory out of logic, but it’s inefficient**

Using LUTs as RAM (distributed RAM) is great for small tables, but for large arrays it burns logic and routing quickly, which is another reason on-chip memory “feels” limited.

**Mental model (helps a lot)**

* MCU/CPU: lots of RAM, runs instructions sequentially
* [FPGA](https://www.ampheoelec.de/c/fpgas-field-programmable-gate-array): less on-chip RAM, but can do many operations in parallel and move data with massive bandwidth
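A back-of-the-envelope calculation shows why distributed RAM doesn’t scale. Assuming a 6-input LUT that can be configured as a 64 × 1-bit RAM (true of Xilinx-style fabric; exact figures vary by vendor, and this ignores the extra address-decode and mux logic for wide memories):

```python
LUT_BITS = 64  # assumption: one 6-input LUT used as a 64x1 distributed RAM

def luts_for_ram(bytes_needed):
    """LUTs consumed to build a memory purely out of LUT RAM (storage only)."""
    return (bytes_needed * 8) // LUT_BITS

print(luts_for_ram(2 * 1024))         # 2 KiB lookup table -> 256 LUTs
print(luts_for_ram(1 * 1024 * 1024))  # 1 MiB array -> 131072 LUTs
```

A 2 KiB table costs a few hundred LUTs, which is fine. A single 1 MiB array eats six figures’ worth of LUTs, on the order of an entire mid-range device, before you’ve implemented any actual logic. That’s why anything beyond small tables belongs in BRAM/URAM or external DDR.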