Implement computer vision algorithms with RISC-V Vector extension

contributed by Terry7Wei7

TEST 1

I found the guide "Running 64- and 32-bit RISC-V Linux on QEMU" on the official RISC-V website, and I am following its instructions to build the setup myself.
Host: Ubuntu 20.04
Running 32-bit RISC-V Linux on QEMU
Use Ubuntu 22.04

Prerequisites

$sudo apt install autoconf automake autotools-dev curl libmpc-dev libmpfr-dev libgmp-dev \
                 gawk build-essential bison flex texinfo gperf libtool patchutils bc \
                 zlib1g-dev libexpat-dev git

Getting the sources

$mkdir riscv32-linux
$cd riscv32-linux

Then download all the required sources, which are:

QEMU
Linux
Busybox

$git clone https://github.com/qemu/qemu
$git clone https://github.com/torvalds/linux
$git clone https://git.busybox.net/busybox

You will also need to install a RISC-V toolchain

$git clone https://github.com/riscv/riscv-gnu-toolchain
$sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ninja-build
$cd riscv-gnu-toolchain
$./configure --prefix=/opt/riscv --with-arch=rv32gc --with-abi=ilp32d
$make linux
$cd ..

$sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev libnewlib-dev device-tree-compiler

git clone https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init

Build QEMU with the RISC-V target:

cd qemu
git checkout v5.0.0
./configure --target-list=riscv32-softmmu
make -j $(nproc)
sudo make install

Build Linux for the RISC-V target. First, checkout to a desired version:

cd linux
git checkout v5.4
make ARCH=riscv CROSS_COMPILE=riscv32-unknown-linux-gnu- defconfig

Then compile the kernel:

make ARCH=riscv CROSS_COMPILE=riscv32-unknown-linux-gnu- -j $(nproc)

Build Busybox:

cd busybox
CROSS_COMPILE=riscv32-unknown-linux-gnu- make defconfig
CROSS_COMPILE=riscv32-unknown-linux-gnu- make -j $(nproc)

Go back to your main working directory and run:

sudo qemu-system-riscv32 -nographic -machine virt \
     -kernel linux/arch/riscv/boot/Image -append "root=/dev/vda ro console=ttyS0" \
     -drive file=busybox,format=raw,id=hd0 \
     -device virtio-blk-device,drive=hd0

Problem analysis:
The Linux build first failed because the machine ran out of storage space. After freeing space, the 'make install' step still fails, and I have not yet determined whether the cause is a version incompatibility or my environment configuration.

ref:https://risc-v-getting-started-guide.readthedocs.io/en/latest/linux-qemu.html

TEST 2

I found a RISC-V toolchain installation guide on GitHub, and I am following its instructions to build the toolchain myself.
Host: Ubuntu 20.04

Clone this repo:

git clone https://github.com/johnwinans/riscv-toolchain-install-guide.git

Update the submodules and check out the correct versions

./installdeps.sh
./setup.sh

Configure, build, and install the GNU toolchain and qemu

(Note that this can take the better part of an hour to complete!)

./buildall.sh

Add the new tools to your PATH variable

If you are using bash and installed the tools in the default location then
adding the following to the end of your .bashrc file will suffice:

export PATH=${HOME}/projects/riscv/install/rv32i/bin:${PATH}

Give qemu a basic sanity check

which qemu-system-riscv32
qemu-system-riscv32 --version
qemu-system-riscv32 -machine help

Give the gnu toolchain a basic sanity check

which riscv32-unknown-elf-as
riscv32-unknown-elf-as --version
riscv32-unknown-elf-gcc --version

Problem analysis:
I still ran into problems: insufficient storage space again prevented me from building Linux, and my terminal window closed unexpectedly, so I could not capture the output.

ref:https://github.com/johnwinans/riscv-toolchain-install-guide/blob/main/README.md

TEST 3

I found a page where Tony Cole provides recommendations and the toolchain packages to install:
Risc-V Vector 32-bit v0p10 Compile/Link/Run Command Lines and Switches:

Tools Required

Compiler used:

https://buildbot.embecosm.com/job/riscv32-clang-ubuntu1804/54/artifact/riscv32-embecosm-clang-ubuntu1804-20210509.tar.gz

Linker used; this was a special build made for me by Embecosm with a full library build (thank you, Embecosm):

https://buildbot.embecosm.com/job/riscv32-gcc-ubuntu1804/60/artifact/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib.tar.gz

QEMU used was Git cloned from (at the time) the latest RISC-V Vector QEMU branch from SiFive:

https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v7-fix

Follow the building instructions in the README.rst.

Compile/Link/Run Switches

LLVM Clang Compile only:

`/data/toolchains/riscv/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-cc -ffunction-sections -fdata-sections -fmacro-backtrace-limit=0 -march=rv32imafcv0p10 -mabi=ilp32f -menable-experimental-extensions -Xclang -target-feature -Xclang -experimental-zvamo -Xclang -target-feature -Xclang -experimental-zvlsseg -O2 -flax-vector-conversions -o filename.obj -c filename.c`

GCC Link:

`/data/tony/Toolchains/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/bin/riscv32-unknown-elf-gcc -ffunction-sections -fdata-sections -march=rv32imafc -mabi=ilp32f -Wl,--gc-sections -o filename.out filename.obj`

QEMU Execute:

`/data/tony/sifive/qemu/build/qemu-riscv32 -s 2048M -p 131072 -cpu rv32,x-v=true filename.out`

I use the GCC Linker to get printf float support in the libraries. If you don’t require library float support, then I think you may be able to just compile and link using Clang.
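
As a hedged illustration of what a build with these switches might exercise, the sketch below adds two int32 arrays with a strip-mined loop. The function name `vec_add_i32` is hypothetical. The vector path assumes the v0.10-era intrinsic names from `<riscv_vector.h>` (`vsetvl_e32m1`, `vle32_v_i32m1`, `vadd_vv_i32m1`, `vse32_v_i32m1`); newer toolchains prefix these with `__riscv_`, so that path should be checked against the compiler actually in use. When `__riscv_vector` is not defined, a scalar fallback is compiled instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in [0, n). */
void vec_add_i32(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__riscv_vector)
    /* ASSUMPTION: v0.10-era intrinsic names; newer compilers use __riscv_*. */
    while (n > 0) {
        size_t vl = vsetvl_e32m1(n);           /* elements handled this pass */
        vint32m1_t va = vle32_v_i32m1(a, vl);  /* load vl elements of a */
        vint32m1_t vb = vle32_v_i32m1(b, vl);  /* load vl elements of b */
        vse32_v_i32m1(dst, vadd_vv_i32m1(va, vb, vl), vl);
        a += vl;
        b += vl;
        dst += vl;
        n -= vl;
    }
#else
    /* Scalar fallback for compilers without the vector extension. */
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
#endif
}
```

Calling `vec_add_i32(d, a, b, 5)` on two five-element arrays fills `d` with the element-wise sums regardless of which path was compiled.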

Sobel

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define WIDTH 8
#define HEIGHT 8

// Define Sobel operator
int sobel_operator_x[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
int sobel_operator_y[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

// Generate a random grayscale image matrix (3 digits)
void generate_random_image(int **image, int width, int height) {
    for (int i = 0; i < height; i++) {
        for (int j = 0; j < width; j++) {
            image[i][j] = rand() % 256;  // Generate a random integer between 0 and 255
        }
    }
}

// Save PGM image file
void save_pgm_image(const char *filename, int **image, int width, int height) {
    FILE *file = fopen(filename, "w");
    if (!file) {
        perror("Error opening file");
        exit(EXIT_FAILURE);
    }

    // Write image header information
    fprintf(file, "P2\n%d %d\n255\n", width, height);

    // Write image data
    for (int i = 0; i < height; i++) {
        for (int j = 0; j < width; j++) {
            fprintf(file, "%*d ", 3, image[i][j]);  // Use a field width of 3 for output
        }
        fprintf(file, "\n");
    }

    fclose(file);
}

// Perform edge detection using the Sobel operator
void sobel_edge_detection(int **input_image, int **output_image, int width, int height) {
    int gx, gy;

    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            gx = (sobel_operator_x[0][0] * input_image[y - 1][x - 1]) + (sobel_operator_x[0][1] * input_image[y - 1][x]) + (sobel_operator_x[0][2] * input_image[y - 1][x + 1]) +
                 (sobel_operator_x[1][0] * input_image[y][x - 1]) + (sobel_operator_x[1][1] * input_image[y][x]) + (sobel_operator_x[1][2] * input_image[y][x + 1]) +
                 (sobel_operator_x[2][0] * input_image[y + 1][x - 1]) + (sobel_operator_x[2][1] * input_image[y + 1][x]) + (sobel_operator_x[2][2] * input_image[y + 1][x + 1]);

            gy = (sobel_operator_y[0][0] * input_image[y - 1][x - 1]) + (sobel_operator_y[0][1] * input_image[y - 1][x]) + (sobel_operator_y[0][2] * input_image[y - 1][x + 1]) +
                 (sobel_operator_y[1][0] * input_image[y][x - 1]) + (sobel_operator_y[1][1] * input_image[y][x]) + (sobel_operator_y[1][2] * input_image[y][x + 1]) +
                 (sobel_operator_y[2][0] * input_image[y + 1][x - 1]) + (sobel_operator_y[2][1] * input_image[y + 1][x]) + (sobel_operator_y[2][2] * input_image[y + 1][x + 1]);

            // Normalize gradient magnitude
            output_image[y][x] = (int)sqrt((double)(gx * gx + gy * gy)) / 4;  // Divide by 4 for normalization
        }
    }
}

int main() {
    int **input_image, **output_image;

    // Allocate memory and generate a random grayscale value matrix
    input_image = (int **)malloc(sizeof(int *) * HEIGHT);
    for (int i = 0; i < HEIGHT; i++) {
        input_image[i] = (int *)malloc(sizeof(int) * WIDTH);
    }
    generate_random_image(input_image, WIDTH, HEIGHT);

    // Allocate zero-initialized memory for the output image, so that the border
    // pixels the Sobel loop never writes are printed as 0
    output_image = (int **)malloc(sizeof(int *) * HEIGHT);
    for (int i = 0; i < HEIGHT; i++) {
        output_image[i] = (int *)calloc(WIDTH, sizeof(int));
    }

    // Output the original image to the terminal
    printf("Original Image:\n");
    for (int i = 0; i < HEIGHT; i++) {
        for (int j = 0; j < WIDTH; j++) {
            printf("%3d ", input_image[i][j]);
        }
        printf("\n");
    }
    printf("\n");

    // Perform Sobel edge detection
    sobel_edge_detection(input_image, output_image, WIDTH, HEIGHT);

    // Output the Sobel-processed image to the terminal
    printf("Sobel Result:\n");
    for (int i = 0; i < HEIGHT; i++) {
        for (int j = 0; j < WIDTH; j++) {
            printf("%3d ", output_image[i][j]);
        }
        printf("\n");
    }

    // Free memory
    for (int i = 0; i < HEIGHT; i++) {
        free(input_image[i]);
        free(output_image[i]);
    }
    free(input_image);
    free(output_image);

    return 0;
}
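
Six of the eighteen Sobel coefficients are zero and the rest are 1 or 2 in magnitude, so the two gradients can be written as column and row differences. The sketch below is one possible formulation, not the code used for the results in this note; the helper name `sobel_gradients` is hypothetical and `p[1][1]` is assumed to be the pixel being processed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: Sobel gradients of the 3x3 neighbourhood p. */
void sobel_gradients(int p[3][3], int *gx, int *gy)
{
    assert(gx != NULL && gy != NULL);
    /* Right column minus left column, middle row weighted by 2. */
    *gx = (p[0][2] - p[0][0]) + 2 * (p[1][2] - p[1][0]) + (p[2][2] - p[2][0]);
    /* Bottom row minus top row, middle column weighted by 2. */
    *gy = (p[2][0] - p[0][0]) + 2 * (p[2][1] - p[0][1]) + (p[2][2] - p[0][2]);
}
```

This produces the same `gx` and `gy` as the table-driven loops in `sobel_edge_detection`, with the multiplications by zero removed.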

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -march=rv32ima -mabi=ilp32 -o sobel.elf sobel.c --sysroot=/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf -L/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib -lc -lgloss -lm
terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-size sobel.elf
   text	   data	    bss	    dec	    hex	filename
  25084	    352	     56	  25492	   6394	sobel.elf
terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-readelf -h sobel.elf
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           RISC-V
  Version:                           0x1
  Entry point address:               0x10094
  Start of program headers:          52 (bytes into file)
  Start of section headers:          380812 (bytes into file)
  Flags:                             0x1, RVC, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         24
  Section header string table index: 23
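
The `Flags: 0x1, RVC, soft-float ABI` line is decoded from the `e_flags` field of the ELF header. The following is a minimal sketch of that decoding for a little-endian ELF32 header; the struct and function names are hypothetical, while the offsets and constants follow the ELF and RISC-V psABI specifications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EM_RISCV                243u     /* e_machine value for RISC-V */
#define EF_RISCV_RVC            0x0001u  /* compressed instructions in use */
#define EF_RISCV_FLOAT_ABI_MASK 0x0006u  /* 0 soft, 2 single, 4 double, 6 quad */

/* Hypothetical summary type for this sketch. */
struct rv_elf_info {
    int is_riscv;   /* 1 if e_machine == EM_RISCV */
    int has_rvc;    /* 1 if the RVC flag is set */
    int float_abi;  /* 0 soft, 1 single, 2 double, 3 quad */
};

/* Read an nbytes-wide little-endian unsigned value. */
static uint32_t read_le(const uint8_t *p, size_t nbytes)
{
    uint32_t v = 0;
    for (size_t i = 0; i < nbytes; i++) {
        v |= (uint32_t)p[i] << (8 * i);
    }
    return v;
}

/* Returns 0 on success, -1 if buf is not a little-endian ELF32 header. */
int decode_rv32_elf_header(const uint8_t *buf, size_t len, struct rv_elf_info *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 52) {
        return -1;  /* an ELF32 header is 52 bytes long */
    }
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F') {
        return -1;  /* wrong magic */
    }
    if (buf[4] != 1 || buf[5] != 1) {
        return -1;  /* not ELFCLASS32 / little endian */
    }
    uint32_t machine = read_le(buf + 18, 2);  /* e_machine */
    uint32_t flags = read_le(buf + 36, 4);    /* e_flags */
    out->is_riscv = (machine == EM_RISCV);
    out->has_rvc = (flags & EF_RISCV_RVC) != 0;
    out->float_abi = (int)((flags & EF_RISCV_FLOAT_ABI_MASK) >> 1);
    return 0;
}
```

For a header with `e_flags` equal to 0x1, as in the output above, this reports `has_rvc == 1` and `float_abi == 0` (soft-float).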

Running on QEMU

terry@ubuntu:~/qemu$ qemu-riscv32 ./sobel.elf
Original Image:
 45 207  70  41   4 180 120 216 
104 167 255  63  43 241 252 217 
122 150   9  44 165  87 116 100 
196 175  21  40 164 233  87 219 
 94  32 251  56 168  78 166  20 
147  37  86  36  68 223  89 141 
 67 123 190 144  22 137 157 126 
119 198  47  38 152 136 245 180 

Sobel Result:
  0   0   0   0   0   0   0   0 
  0  58 127  87 146 121  54   0 
  0  69 165  72 125  17  84   0 
  0  76 129 112 112  51  18   0 
  0  67  22  14 117  32  57   0 
  0  45  17  72 103  44  68   0 
  0  69  36  63  70  97  67   0 
  0   0   0   0   0   0   0   0
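
As a check on the output, the value 58 at row 1, column 1 of the Sobel result can be reproduced by hand from the 3x3 neighbourhood at the top-left corner of the original image (45 207 70 / 104 167 255 / 122 150 9). The symbols G_x and G_y below stand for the `gx` and `gy` variables in the source:

```latex
\begin{aligned}
G_x &= (70 - 45) + 2\,(255 - 104) + (9 - 122) = 214 \\
G_y &= (122 - 45) + 2\,(150 - 207) + (9 - 70) = -98 \\
G_x^2 + G_y^2 &= 45796 + 9604 = 55400 \\
\lfloor \sqrt{55400} \rfloor &= 235 \quad (235^2 = 55225 \le 55400 < 55696 = 236^2) \\
\lfloor 235 / 4 \rfloor &= 58
\end{aligned}
```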

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -S -march=rv32ima -mabi=ilp32 -o sobel.S sobel.c

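Assembly. Note that main in this listing comes from a variant of the program that uses 32x32 images and calls save_pgm_image to write original.pgm and sobel_result.pgm instead of printing the matrices.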

	.text
	.attribute	4, 16
	.attribute	5, "rv32i2p0_m2p0_a2p0"
	.file	"sobel.c"
	.globl	generate_random_image           # -- Begin function generate_random_image
	.p2align	2
	.type	generate_random_image,@function
generate_random_image:                  # @generate_random_image
# %bb.0:
	addi	sp, sp, -32
	sw	ra, 28(sp)                      # 4-byte Folded Spill
	sw	s0, 24(sp)                      # 4-byte Folded Spill
	addi	s0, sp, 32
	sw	a0, -12(s0)
	sw	a1, -16(s0)
	sw	a2, -20(s0)
	mv	a0, zero
	sw	a0, -24(s0)
	j	.LBB0_1
.LBB0_1:                                # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_3 Depth 2
	lw	a0, -24(s0)
	lw	a1, -20(s0)
	bge	a0, a1, .LBB0_8
	j	.LBB0_2
.LBB0_2:                                #   in Loop: Header=BB0_1 Depth=1
	mv	a0, zero
	sw	a0, -28(s0)
	j	.LBB0_3
.LBB0_3:                                #   Parent Loop BB0_1 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lw	a0, -28(s0)
	lw	a1, -16(s0)
	bge	a0, a1, .LBB0_6
	j	.LBB0_4
.LBB0_4:                                #   in Loop: Header=BB0_3 Depth=2
	call	rand
	srai	a1, a0, 31
	srli	a1, a1, 24
	add	a1, a0, a1
	andi	a1, a1, -256
	sub	a0, a0, a1
	lw	a1, -12(s0)
	lw	a2, -24(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	lw	a1, 0(a1)
	lw	a2, -28(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	sw	a0, 0(a1)
	j	.LBB0_5
.LBB0_5:                                #   in Loop: Header=BB0_3 Depth=2
	lw	a0, -28(s0)
	addi	a0, a0, 1
	sw	a0, -28(s0)
	j	.LBB0_3
.LBB0_6:                                #   in Loop: Header=BB0_1 Depth=1
	j	.LBB0_7
.LBB0_7:                                #   in Loop: Header=BB0_1 Depth=1
	lw	a0, -24(s0)
	addi	a0, a0, 1
	sw	a0, -24(s0)
	j	.LBB0_1
.LBB0_8:
	lw	s0, 24(sp)                      # 4-byte Folded Reload
	lw	ra, 28(sp)                      # 4-byte Folded Reload
	addi	sp, sp, 32
	ret
.Lfunc_end0:
	.size	generate_random_image, .Lfunc_end0-generate_random_image
                                        # -- End function
	.globl	save_pgm_image                  # -- Begin function save_pgm_image
	.p2align	2
	.type	save_pgm_image,@function
save_pgm_image:                         # @save_pgm_image
# %bb.0:
	addi	sp, sp, -48
	sw	ra, 44(sp)                      # 4-byte Folded Spill
	sw	s0, 40(sp)                      # 4-byte Folded Spill
	addi	s0, sp, 48
	sw	a0, -12(s0)
	sw	a1, -16(s0)
	sw	a2, -20(s0)
	sw	a3, -24(s0)
	lw	a0, -12(s0)
	lui	a1, %hi(.L.str)
	addi	a1, a1, %lo(.L.str)
	call	fopen
	sw	a0, -28(s0)
	lw	a0, -28(s0)
	mv	a1, zero
	bne	a0, a1, .LBB1_2
	j	.LBB1_1
.LBB1_1:
	lui	a0, %hi(.L.str.1)
	addi	a0, a0, %lo(.L.str.1)
	call	perror
	addi	a0, zero, 1
	call	exit
.LBB1_2:
	lw	a0, -28(s0)
	lw	a2, -20(s0)
	lw	a3, -24(s0)
	lui	a1, %hi(.L.str.2)
	addi	a1, a1, %lo(.L.str.2)
	call	fprintf
	mv	a0, zero
	sw	a0, -32(s0)
	j	.LBB1_3
.LBB1_3:                                # =>This Loop Header: Depth=1
                                        #     Child Loop BB1_5 Depth 2
	lw	a0, -32(s0)
	lw	a1, -24(s0)
	bge	a0, a1, .LBB1_10
	j	.LBB1_4
.LBB1_4:                                #   in Loop: Header=BB1_3 Depth=1
	mv	a0, zero
	sw	a0, -36(s0)
	j	.LBB1_5
.LBB1_5:                                #   Parent Loop BB1_3 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lw	a0, -36(s0)
	lw	a1, -20(s0)
	bge	a0, a1, .LBB1_8
	j	.LBB1_6
.LBB1_6:                                #   in Loop: Header=BB1_5 Depth=2
	lw	a0, -28(s0)
	lw	a1, -16(s0)
	lw	a2, -32(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	lw	a1, 0(a1)
	lw	a2, -36(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	lw	a3, 0(a1)
	lui	a1, %hi(.L.str.3)
	addi	a1, a1, %lo(.L.str.3)
	addi	a2, zero, 3
	call	fprintf
	j	.LBB1_7
.LBB1_7:                                #   in Loop: Header=BB1_5 Depth=2
	lw	a0, -36(s0)
	addi	a0, a0, 1
	sw	a0, -36(s0)
	j	.LBB1_5
.LBB1_8:                                #   in Loop: Header=BB1_3 Depth=1
	lw	a0, -28(s0)
	lui	a1, %hi(.L.str.4)
	addi	a1, a1, %lo(.L.str.4)
	call	fprintf
	j	.LBB1_9
.LBB1_9:                                #   in Loop: Header=BB1_3 Depth=1
	lw	a0, -32(s0)
	addi	a0, a0, 1
	sw	a0, -32(s0)
	j	.LBB1_3
.LBB1_10:
	lw	a0, -28(s0)
	call	fclose
	lw	s0, 40(sp)                      # 4-byte Folded Reload
	lw	ra, 44(sp)                      # 4-byte Folded Reload
	addi	sp, sp, 48
	ret
.Lfunc_end1:
	.size	save_pgm_image, .Lfunc_end1-save_pgm_image
                                        # -- End function
	.globl	sobel_edge_detection            # -- Begin function sobel_edge_detection
	.p2align	2
	.type	sobel_edge_detection,@function
sobel_edge_detection:                   # @sobel_edge_detection
# %bb.0:
	addi	sp, sp, -48
	sw	ra, 44(sp)                      # 4-byte Folded Spill
	sw	s0, 40(sp)                      # 4-byte Folded Spill
	addi	s0, sp, 48
	sw	a0, -12(s0)
	sw	a1, -16(s0)
	sw	a2, -20(s0)
	sw	a3, -24(s0)
	addi	a0, zero, 1
	sw	a0, -36(s0)
	j	.LBB2_1
.LBB2_1:                                # =>This Loop Header: Depth=1
                                        #     Child Loop BB2_3 Depth 2
	lw	a0, -36(s0)
	lw	a1, -24(s0)
	addi	a1, a1, -1
	bge	a0, a1, .LBB2_8
	j	.LBB2_2
.LBB2_2:                                #   in Loop: Header=BB2_1 Depth=1
	addi	a0, zero, 1
	sw	a0, -40(s0)
	j	.LBB2_3
.LBB2_3:                                #   Parent Loop BB2_1 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lw	a0, -40(s0)
	lw	a1, -20(s0)
	addi	a1, a1, -1
	bge	a0, a1, .LBB2_6
	j	.LBB2_4
.LBB2_4:                                #   in Loop: Header=BB2_3 Depth=2
	lui	a1, %hi(sobel_operator_x)
	lw	a0, %lo(sobel_operator_x)(a1)
	lw	a2, -12(s0)
	lw	a3, -36(s0)
	slli	a3, a3, 2
	add	a2, a2, a3
	lw	a7, -4(a2)
	lw	a3, -40(s0)
	slli	a5, a3, 2
	addi	a6, a5, -4
	add	a3, a7, a6
	lw	a3, 0(a3)
	mul	a0, a0, a3
	addi	a1, a1, %lo(sobel_operator_x)
	sw	a1, -48(s0)                     # 4-byte Folded Spill
	lw	a3, 4(a1)
	add	a4, a7, a5
	lw	a4, 0(a4)
	mul	a3, a3, a4
	add	a0, a0, a3
	lw	a4, 8(a1)
	addi	a3, a5, 4
	add	a7, a7, a3
	lw	a7, 0(a7)
	mul	a4, a4, a7
	add	a0, a0, a4
	lw	a4, 12(a1)
	lw	a7, 0(a2)
	add	t0, a7, a6
	lw	t0, 0(t0)
	mul	a4, a4, t0
	add	a0, a0, a4
	lw	a4, 16(a1)
	add	t0, a7, a5
	lw	t0, 0(t0)
	mul	a4, a4, t0
	add	a0, a0, a4
	lw	a4, 20(a1)
	add	a7, a7, a3
	lw	a7, 0(a7)
	mul	a4, a4, a7
	add	a0, a0, a4
	lw	a4, 24(a1)
	lw	a2, 4(a2)
	add	a6, a2, a6
	lw	a6, 0(a6)
	mul	a4, a4, a6
	add	a0, a0, a4
	lw	a4, 28(a1)
	add	a5, a2, a5
	lw	a5, 0(a5)
	mul	a4, a4, a5
	add	a0, a0, a4
	lw	a1, 32(a1)
	add	a2, a2, a3
	lw	a2, 0(a2)
	mul	a1, a1, a2
	add	a0, a0, a1
	sw	a0, -28(s0)
	lui	a1, %hi(sobel_operator_y)
	lw	a0, %lo(sobel_operator_y)(a1)
	lw	a2, -12(s0)
	lw	a3, -36(s0)
	slli	a3, a3, 2
	add	a2, a2, a3
	lw	a7, -4(a2)
	lw	a3, -40(s0)
	slli	a5, a3, 2
	addi	a6, a5, -4
	add	a3, a7, a6
	lw	a3, 0(a3)
	mul	a0, a0, a3
	addi	a1, a1, %lo(sobel_operator_y)
	sw	a1, -44(s0)                     # 4-byte Folded Spill
	lw	a3, 4(a1)
	add	a4, a7, a5
	lw	a4, 0(a4)
	mul	a3, a3, a4
	add	a0, a0, a3
	lw	a4, 8(a1)
	addi	a3, a5, 4
	add	a7, a7, a3
	lw	a7, 0(a7)
	mul	a4, a4, a7
	add	a0, a0, a4
	lw	a4, 12(a1)
	lw	a7, 0(a2)
	add	t0, a7, a6
	lw	t0, 0(t0)
	mul	a4, a4, t0
	add	a0, a0, a4
	lw	a4, 16(a1)
	add	t0, a7, a5
	lw	t0, 0(t0)
	mul	a4, a4, t0
	add	a0, a0, a4
	lw	a4, 20(a1)
	add	a7, a7, a3
	lw	a7, 0(a7)
	mul	a4, a4, a7
	add	a0, a0, a4
	lw	a4, 24(a1)
	lw	a2, 4(a2)
	add	a6, a2, a6
	lw	a6, 0(a6)
	mul	a4, a4, a6
	add	a0, a0, a4
	lw	a4, 28(a1)
	add	a5, a2, a5
	lw	a5, 0(a5)
	mul	a4, a4, a5
	add	a0, a0, a4
	lw	a1, 32(a1)
	add	a2, a2, a3
	lw	a2, 0(a2)
	mul	a1, a1, a2
	add	a0, a0, a1
	sw	a0, -32(s0)
	lw	a0, -28(s0)
	mul	a0, a0, a0
	lw	a1, -32(s0)
	mul	a1, a1, a1
	add	a0, a0, a1
	call	__floatsidf@plt
	call	sqrt
	call	__fixdfsi@plt
	srai	a1, a0, 31
	srli	a1, a1, 30
	add	a0, a0, a1
	srai	a0, a0, 2
	lw	a1, -16(s0)
	lw	a2, -36(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	lw	a1, 0(a1)
	lw	a2, -40(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	sw	a0, 0(a1)
	j	.LBB2_5
.LBB2_5:                                #   in Loop: Header=BB2_3 Depth=2
	lw	a0, -40(s0)
	addi	a0, a0, 1
	sw	a0, -40(s0)
	j	.LBB2_3
.LBB2_6:                                #   in Loop: Header=BB2_1 Depth=1
	j	.LBB2_7
.LBB2_7:                                #   in Loop: Header=BB2_1 Depth=1
	lw	a0, -36(s0)
	addi	a0, a0, 1
	sw	a0, -36(s0)
	j	.LBB2_1
.LBB2_8:
	lw	s0, 40(sp)                      # 4-byte Folded Reload
	lw	ra, 44(sp)                      # 4-byte Folded Reload
	addi	sp, sp, 48
	ret
.Lfunc_end2:
	.size	sobel_edge_detection, .Lfunc_end2-sobel_edge_detection
                                        # -- End function
	.globl	main                            # -- Begin function main
	.p2align	2
	.type	main,@function
main:                                   # @main
# %bb.0:
	addi	sp, sp, -48
	sw	ra, 44(sp)                      # 4-byte Folded Spill
	sw	s0, 40(sp)                      # 4-byte Folded Spill
	addi	s0, sp, 48
	mv	a0, zero
	sw	a0, -36(s0)                     # 4-byte Folded Spill
	sw	a0, -12(s0)
	addi	a0, zero, 128
	call	malloc
	mv	a1, a0
	lw	a0, -36(s0)                     # 4-byte Folded Reload
	sw	a1, -16(s0)
	sw	a0, -24(s0)
	j	.LBB3_1
.LBB3_1:                                # =>This Inner Loop Header: Depth=1
	lw	a1, -24(s0)
	addi	a0, zero, 31
	blt	a0, a1, .LBB3_4
	j	.LBB3_2
.LBB3_2:                                #   in Loop: Header=BB3_1 Depth=1
	addi	a0, zero, 128
	call	malloc
	lw	a1, -16(s0)
	lw	a2, -24(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	sw	a0, 0(a1)
	j	.LBB3_3
.LBB3_3:                                #   in Loop: Header=BB3_1 Depth=1
	lw	a0, -24(s0)
	addi	a0, a0, 1
	sw	a0, -24(s0)
	j	.LBB3_1
.LBB3_4:
	lw	a0, -16(s0)
	addi	a2, zero, 32
	mv	a1, a2
	call	generate_random_image
	addi	a0, zero, 128
	call	malloc
	sw	a0, -20(s0)
	mv	a0, zero
	sw	a0, -28(s0)
	j	.LBB3_5
.LBB3_5:                                # =>This Inner Loop Header: Depth=1
	lw	a1, -28(s0)
	addi	a0, zero, 31
	blt	a0, a1, .LBB3_8
	j	.LBB3_6
.LBB3_6:                                #   in Loop: Header=BB3_5 Depth=1
	addi	a0, zero, 128
	call	malloc
	lw	a1, -20(s0)
	lw	a2, -28(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	sw	a0, 0(a1)
	j	.LBB3_7
.LBB3_7:                                #   in Loop: Header=BB3_5 Depth=1
	lw	a0, -28(s0)
	addi	a0, a0, 1
	sw	a0, -28(s0)
	j	.LBB3_5
.LBB3_8:
	lw	a1, -16(s0)
	lui	a0, %hi(.L.str.5)
	addi	a0, a0, %lo(.L.str.5)
	addi	a3, zero, 32
	sw	a3, -40(s0)                     # 4-byte Folded Spill
	mv	a2, a3
	call	save_pgm_image
	lw	a3, -40(s0)                     # 4-byte Folded Reload
	lw	a0, -16(s0)
	lw	a1, -20(s0)
	mv	a2, a3
	call	sobel_edge_detection
	lw	a3, -40(s0)                     # 4-byte Folded Reload
	lw	a1, -20(s0)
	lui	a0, %hi(.L.str.6)
	addi	a0, a0, %lo(.L.str.6)
	mv	a2, a3
	call	save_pgm_image
	mv	a0, zero
	sw	a0, -32(s0)
	j	.LBB3_9
.LBB3_9:                                # =>This Inner Loop Header: Depth=1
	lw	a1, -32(s0)
	addi	a0, zero, 31
	blt	a0, a1, .LBB3_12
	j	.LBB3_10
.LBB3_10:                               #   in Loop: Header=BB3_9 Depth=1
	lw	a0, -16(s0)
	lw	a1, -32(s0)
	slli	a1, a1, 2
	add	a0, a0, a1
	lw	a0, 0(a0)
	call	free
	lw	a0, -20(s0)
	lw	a1, -32(s0)
	slli	a1, a1, 2
	add	a0, a0, a1
	lw	a0, 0(a0)
	call	free
	j	.LBB3_11
.LBB3_11:                               #   in Loop: Header=BB3_9 Depth=1
	lw	a0, -32(s0)
	addi	a0, a0, 1
	sw	a0, -32(s0)
	j	.LBB3_9
.LBB3_12:
	lw	a0, -16(s0)
	call	free
	lw	a0, -20(s0)
	call	free
	mv	a0, zero
	lw	s0, 40(sp)                      # 4-byte Folded Reload
	lw	ra, 44(sp)                      # 4-byte Folded Reload
	addi	sp, sp, 48
	ret
.Lfunc_end3:
	.size	main, .Lfunc_end3-main
                                        # -- End function
	.type	sobel_operator_x,@object        # @sobel_operator_x
	.data
	.globl	sobel_operator_x
	.p2align	2
sobel_operator_x:
	.word	4294967295                      # 0xffffffff
	.word	0                               # 0x0
	.word	1                               # 0x1
	.word	4294967294                      # 0xfffffffe
	.word	0                               # 0x0
	.word	2                               # 0x2
	.word	4294967295                      # 0xffffffff
	.word	0                               # 0x0
	.word	1                               # 0x1
	.size	sobel_operator_x, 36

	.type	sobel_operator_y,@object        # @sobel_operator_y
	.globl	sobel_operator_y
	.p2align	2
sobel_operator_y:
	.word	4294967295                      # 0xffffffff
	.word	4294967294                      # 0xfffffffe
	.word	4294967295                      # 0xffffffff
	.zero	12
	.word	1                               # 0x1
	.word	2                               # 0x2
	.word	1                               # 0x1
	.size	sobel_operator_y, 36

	.type	.L.str,@object                  # @.str
	.section	.rodata.str1.1,"aMS",@progbits,1
.L.str:
	.asciz	"w"
	.size	.L.str, 2

	.type	.L.str.1,@object                # @.str.1
.L.str.1:
	.asciz	"Error opening file"
	.size	.L.str.1, 19

	.type	.L.str.2,@object                # @.str.2
.L.str.2:
	.asciz	"P2\n%d %d\n255\n"
	.size	.L.str.2, 14

	.type	.L.str.3,@object                # @.str.3
.L.str.3:
	.asciz	"%*d "
	.size	.L.str.3, 5

	.type	.L.str.4,@object                # @.str.4
.L.str.4:
	.asciz	"\n"
	.size	.L.str.4, 2

	.type	.L.str.5,@object                # @.str.5
.L.str.5:
	.asciz	"original.pgm"
	.size	.L.str.5, 13

	.type	.L.str.6,@object                # @.str.6
.L.str.6:
	.asciz	"sobel_result.pgm"
	.size	.L.str.6, 17

	.ident	"riscv32-embecosm-clang-ubuntu1804-20210509 clang version 13.0.0 (https://mirrors.git.embecosm.com/mirrors/llvm-project.git 4aec8f4ce0f564aa68c23b9e29c2e3a945eec947)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
	.addrsig_sym generate_random_image
	.addrsig_sym rand
	.addrsig_sym save_pgm_image
	.addrsig_sym fopen
	.addrsig_sym perror
	.addrsig_sym exit
	.addrsig_sym fprintf
	.addrsig_sym fclose
	.addrsig_sym sobel_edge_detection
	.addrsig_sym sqrt
	.addrsig_sym malloc
	.addrsig_sym free
	.addrsig_sym sobel_operator_x
	.addrsig_sym sobel_operator_y
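
Because `-march=rv32ima` has no F or D extension, the listing above reaches the square root through the library calls `__floatsidf`, `sqrt`, and `__fixdfsi` for every pixel. One possible way to avoid them is an integer square root. The sketch below is an assumption about how that could look, not the method used for the results in this note; the names `isqrt_u32` and `sobel_magnitude_int` are hypothetical. It also clamps the result to 255, the maximum value declared in the PGM header, which the original `/ 4` normalization does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Integer square root: largest r with r * r <= n (bit-by-bit method). */
uint32_t isqrt_u32(uint32_t n)
{
    uint32_t root = 0;
    uint32_t bit = 1u << 30;  /* highest power of four that fits in 32 bits */
    while (bit > n) {
        bit >>= 2;
    }
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

/* Hypothetical replacement for the per-pixel magnitude step. */
int sobel_magnitude_int(int gx, int gy)
{
    /* For 8-bit input, |gx| and |gy| are at most 4 * 255 = 1020. */
    assert(gx >= -1020 && gx <= 1020 && gy >= -1020 && gy <= 1020);
    uint32_t sum_sq = (uint32_t)(gx * gx) + (uint32_t)(gy * gy);
    int mag = (int)isqrt_u32(sum_sq) / 4;  /* same scaling as the C source */
    return mag > 255 ? 255 : mag;          /* clamp to the PGM maximum */
}
```

For the gradients of the first interior pixel, `sobel_magnitude_int(214, -98)` gives 58, matching the QEMU output; only magnitudes that would exceed 255 differ from the original code.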

Gaussian Blur

#include <stdio.h>

#define WIDTH 5
#define HEIGHT 5

// Image data
int image[WIDTH][HEIGHT] = {
    {120, 50, 200, 30, 80},
    {90, 180, 60, 40, 140},
    {70, 20, 110, 10, 160},
    {130, 100, 150, 190, 220},
    {30, 80, 120, 50, 200}
};

// Gaussian kernel
float gaussian_kernel[3][3] = {
    {1.0 / 16, 2.0 / 16, 1.0 / 16},
    {2.0 / 16, 4.0 / 16, 2.0 / 16},
    {1.0 / 16, 2.0 / 16, 1.0 / 16}
};

// Gaussian blur function
void gaussian_blur() {
    // Zero-initialize so that the border pixels, which the loop never writes, print as 0
    int blurred_image[WIDTH][HEIGHT] = {0};

    for (int i = 1; i < WIDTH - 1; i++) {
        for (int j = 1; j < HEIGHT - 1; j++) {
            // Perform convolution
            float sum = 0.0;
            for (int m = -1; m <= 1; m++) {
                for (int n = -1; n <= 1; n++) {
                    sum += image[i + m][j + n] * gaussian_kernel[m + 1][n + 1];
                }
            }

            // Round and store the result
            blurred_image[i][j] = (int)(sum + 0.5);
        }
    }

    // Output the original image
    printf("Original Image:\n");
    for (int i = 0; i < WIDTH; i++) {
        for (int j = 0; j < HEIGHT; j++) {
            printf("%d ", image[i][j]);
        }
        printf("\n");
    }

    // Output the image after Gaussian blur
    printf("\nBlurred Image:\n");
    for (int i = 0; i < WIDTH; i++) {
        for (int j = 0; j < HEIGHT; j++) {
            printf("%d ", blurred_image[i][j]);
        }
        printf("\n");
    }
}

int main() {
    // Perform Gaussian blur
    gaussian_blur();

    return 0;
}
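
The loop in `gaussian_blur` skips the outermost rows and columns, so the border of the result is never filtered. One common alternative is to replicate the edge pixels by clamping the neighbour indices. The sketch below shows that idea under stated assumptions: `BLUR_N`, `clamp_index`, and `blur_pixel_replicate` are hypothetical names, and the kernel is the same 3x3 kernel as above.

```c
#include <assert.h>

#define BLUR_N 5  /* image is BLUR_N x BLUR_N, matching WIDTH and HEIGHT */

/* Clamp v into the inclusive range [lo, hi]. */
static int clamp_index(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical variant: blur one pixel, replicating edge pixels at the border. */
int blur_pixel_replicate(int img[BLUR_N][BLUR_N], int i, int j)
{
    static const float k[3][3] = {
        {1.0f / 16, 2.0f / 16, 1.0f / 16},
        {2.0f / 16, 4.0f / 16, 2.0f / 16},
        {1.0f / 16, 2.0f / 16, 1.0f / 16}
    };
    assert(i >= 0 && i < BLUR_N && j >= 0 && j < BLUR_N);
    float sum = 0.0f;
    for (int m = -1; m <= 1; m++) {
        for (int n = -1; n <= 1; n++) {
            /* Out-of-range neighbours fall back to the nearest edge pixel. */
            int r = clamp_index(i + m, 0, BLUR_N - 1);
            int c = clamp_index(j + n, 0, BLUR_N - 1);
            sum += (float)img[r][c] * k[m + 1][n + 1];
        }
    }
    return (int)(sum + 0.5f);  /* round to nearest, as in the original */
}
```

Interior pixels are unchanged by this variant; only the border pixels, which the original leaves unfiltered, receive a blurred value.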

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -march=rv32ima -mabi=ilp32 -o gaussian.elf gaussian.c --sysroot=/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf -L/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib -lc -lgloss -lm
terry@ubuntu:~/project$  ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-size gaussian.elf
   text	   data	    bss	    dec	    hex	filename
  15322	    416	     56	  15794	   3db2	gaussian.elf
terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-readelf -h gaussian.elf
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           RISC-V
  Version:                           0x1
  Entry point address:               0x10094
  Start of program headers:          52 (bytes into file)
  Start of section headers:          266440 (bytes into file)
  Flags:                             0x1, RVC, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         24
  Section header string table index: 23

Running on QEMU

terry@ubuntu:~/qemu$ qemu-riscv32 ./gaussian.elf
Original Image:
120 50 200 30 80 
90 180 60 40 140 
70 20 110 10 160 
130 100 150 190 220 
30 80 120 50 200 

Blurred Image:
0 0 0 0 0 
0 104 88 74 0 
0 89 89 101 0 
0 93 113 138 0 
0 0 0 0 0
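
The blurred values above can also be reproduced with integer arithmetic only, which matters on `rv32ima` because every float operation in the C source turns into a soft-float library call (`__floatsisf`, `__mulsf3`, `__addsf3` in the assembly for this program). The sketch below is an assumption about how that could be done, not the code that produced the output; `GB_SIZE` and `gaussian_blur_fixed` are hypothetical names. It uses the integer weights 1, 2, 4 and divides by 16 with rounding, which for non-negative pixels matches `(int)(sum + 0.5)`.

```c
#include <assert.h>

#define GB_SIZE 5  /* image is GB_SIZE x GB_SIZE, matching WIDTH and HEIGHT */

/* Hypothetical integer-only blur: interior pixels are filtered,
 * border pixels of dst are set to 0. */
void gaussian_blur_fixed(int src[GB_SIZE][GB_SIZE], int dst[GB_SIZE][GB_SIZE])
{
    static const int w[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    assert(src != dst);  /* in-place filtering is not supported */

    for (int i = 0; i < GB_SIZE; i++) {
        for (int j = 0; j < GB_SIZE; j++) {
            dst[i][j] = 0;
        }
    }
    for (int i = 1; i < GB_SIZE - 1; i++) {
        for (int j = 1; j < GB_SIZE - 1; j++) {
            int acc = 0;
            for (int m = -1; m <= 1; m++) {
                for (int n = -1; n <= 1; n++) {
                    acc += src[i + m][j + n] * w[m + 1][n + 1];
                }
            }
            /* Weights sum to 16: add 8 and shift by 4 to round to nearest. */
            dst[i][j] = (acc + 8) >> 4;
        }
    }
}
```

Applied to the 5x5 image from the source, this gives 104, 74, and 138 at positions (1,1), (1,3), and (3,3), the same values as the QEMU run.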

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -S -march=rv32ima -mabi=ilp32 -o gaussian.S gaussian.c

Assembly

	.text
	.attribute	4, 16
	.attribute	5, "rv32i2p0_m2p0_a2p0"
	.file	"gaussian.c"
	.globl	gaussian_blur                   # -- Begin function gaussian_blur
	.p2align	2
	.type	gaussian_blur,@function
gaussian_blur:                          # @gaussian_blur
# %bb.0:
	addi	sp, sp, -160
	sw	ra, 156(sp)                     # 4-byte Folded Spill
	sw	s0, 152(sp)                     # 4-byte Folded Spill
	addi	s0, sp, 160
	addi	a0, zero, 1
	sw	a0, -112(s0)
	j	.LBB0_1
.LBB0_1:                                # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_3 Depth 2
                                        #       Child Loop BB0_5 Depth 3
                                        #         Child Loop BB0_7 Depth 4
	lw	a1, -112(s0)
	addi	a0, zero, 3
	blt	a0, a1, .LBB0_16
	j	.LBB0_2
.LBB0_2:                                #   in Loop: Header=BB0_1 Depth=1
	addi	a0, zero, 1
	sw	a0, -116(s0)
	j	.LBB0_3
.LBB0_3:                                #   Parent Loop BB0_1 Depth=1
                                        # =>  This Loop Header: Depth=2
                                        #       Child Loop BB0_5 Depth 3
                                        #         Child Loop BB0_7 Depth 4
	lw	a1, -116(s0)
	addi	a0, zero, 3
	blt	a0, a1, .LBB0_14
	j	.LBB0_4
.LBB0_4:                                #   in Loop: Header=BB0_3 Depth=2
	mv	a0, zero
	sw	a0, -120(s0)
	addi	a0, zero, -1
	sw	a0, -124(s0)
	j	.LBB0_5
.LBB0_5:                                #   Parent Loop BB0_1 Depth=1
                                        #     Parent Loop BB0_3 Depth=2
                                        # =>    This Loop Header: Depth=3
                                        #         Child Loop BB0_7 Depth 4
	lw	a1, -124(s0)
	addi	a0, zero, 1
	blt	a0, a1, .LBB0_12
	j	.LBB0_6
.LBB0_6:                                #   in Loop: Header=BB0_5 Depth=3
	addi	a0, zero, -1
	sw	a0, -128(s0)
	j	.LBB0_7
.LBB0_7:                                #   Parent Loop BB0_1 Depth=1
                                        #     Parent Loop BB0_3 Depth=2
                                        #       Parent Loop BB0_5 Depth=3
                                        # =>      This Inner Loop Header: Depth=4
	lw	a1, -128(s0)
	addi	a0, zero, 1
	blt	a0, a1, .LBB0_10
	j	.LBB0_8
.LBB0_8:                                #   in Loop: Header=BB0_7 Depth=4
	lw	a0, -112(s0)
	lw	a1, -124(s0)
	sw	a1, -152(s0)                    # 4-byte Folded Spill
	add	a0, a0, a1
	addi	a1, zero, 20
	mul	a0, a0, a1
	lw	a1, -116(s0)
	lw	a2, -128(s0)
	sw	a2, -148(s0)                    # 4-byte Folded Spill
	add	a1, a1, a2
	slli	a1, a1, 2
	add	a0, a0, a1
	lui	a1, %hi(image)
	addi	a1, a1, %lo(image)
	add	a0, a0, a1
	lw	a0, 0(a0)
	call	__floatsisf@plt
	lw	a1, -152(s0)                    # 4-byte Folded Reload
	lw	a2, -148(s0)                    # 4-byte Folded Reload
	addi	a3, zero, 12
	mul	a1, a1, a3
	slli	a2, a2, 2
	add	a2, a1, a2
	lui	a1, %hi(gaussian_kernel)
	addi	a1, a1, %lo(gaussian_kernel)
	add	a1, a1, a2
	lw	a1, 16(a1)
	call	__mulsf3@plt
	mv	a1, a0
	lw	a0, -120(s0)
	call	__addsf3@plt
	sw	a0, -120(s0)
	j	.LBB0_9
.LBB0_9:                                #   in Loop: Header=BB0_7 Depth=4
	lw	a0, -128(s0)
	addi	a0, a0, 1
	sw	a0, -128(s0)
	j	.LBB0_7
.LBB0_10:                               #   in Loop: Header=BB0_5 Depth=3
	j	.LBB0_11
.LBB0_11:                               #   in Loop: Header=BB0_5 Depth=3
	lw	a0, -124(s0)
	addi	a0, a0, 1
	sw	a0, -124(s0)
	j	.LBB0_5
.LBB0_12:                               #   in Loop: Header=BB0_3 Depth=2
	lw	a0, -120(s0)
	call	__extendsfdf2@plt
	mv	a2, zero
	lui	a3, 261632
	call	__adddf3@plt
	call	__fixdfsi@plt
	lw	a1, -112(s0)
	addi	a2, zero, 20
	mul	a2, a1, a2
	addi	a1, s0, -108
	add	a1, a1, a2
	lw	a2, -116(s0)
	slli	a2, a2, 2
	add	a1, a1, a2
	sw	a0, 0(a1)
	j	.LBB0_13
.LBB0_13:                               #   in Loop: Header=BB0_3 Depth=2
	lw	a0, -116(s0)
	addi	a0, a0, 1
	sw	a0, -116(s0)
	j	.LBB0_3
.LBB0_14:                               #   in Loop: Header=BB0_1 Depth=1
	j	.LBB0_15
.LBB0_15:                               #   in Loop: Header=BB0_1 Depth=1
	lw	a0, -112(s0)
	addi	a0, a0, 1
	sw	a0, -112(s0)
	j	.LBB0_1
.LBB0_16:
	lui	a0, %hi(.L.str)
	addi	a0, a0, %lo(.L.str)
	call	printf
	mv	a0, zero
	sw	a0, -132(s0)
	j	.LBB0_17
.LBB0_17:                               # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_19 Depth 2
	lw	a1, -132(s0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_24
	j	.LBB0_18
.LBB0_18:                               #   in Loop: Header=BB0_17 Depth=1
	mv	a0, zero
	sw	a0, -136(s0)
	j	.LBB0_19
.LBB0_19:                               #   Parent Loop BB0_17 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lw	a1, -136(s0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_22
	j	.LBB0_20
.LBB0_20:                               #   in Loop: Header=BB0_19 Depth=2
	lw	a0, -132(s0)
	addi	a1, zero, 20
	mul	a0, a0, a1
	lw	a1, -136(s0)
	slli	a1, a1, 2
	add	a0, a0, a1
	lui	a1, %hi(image)
	addi	a1, a1, %lo(image)
	add	a0, a0, a1
	lw	a1, 0(a0)
	lui	a0, %hi(.L.str.1)
	addi	a0, a0, %lo(.L.str.1)
	call	printf
	j	.LBB0_21
.LBB0_21:                               #   in Loop: Header=BB0_19 Depth=2
	lw	a0, -136(s0)
	addi	a0, a0, 1
	sw	a0, -136(s0)
	j	.LBB0_19
.LBB0_22:                               #   in Loop: Header=BB0_17 Depth=1
	lui	a0, %hi(.L.str.2)
	addi	a0, a0, %lo(.L.str.2)
	call	printf
	j	.LBB0_23
.LBB0_23:                               #   in Loop: Header=BB0_17 Depth=1
	lw	a0, -132(s0)
	addi	a0, a0, 1
	sw	a0, -132(s0)
	j	.LBB0_17
.LBB0_24:
	lui	a0, %hi(.L.str.3)
	addi	a0, a0, %lo(.L.str.3)
	call	printf
	mv	a0, zero
	sw	a0, -140(s0)
	j	.LBB0_25
.LBB0_25:                               # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_27 Depth 2
	lw	a1, -140(s0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_32
	j	.LBB0_26
.LBB0_26:                               #   in Loop: Header=BB0_25 Depth=1
	mv	a0, zero
	sw	a0, -144(s0)
	j	.LBB0_27
.LBB0_27:                               #   Parent Loop BB0_25 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lw	a1, -144(s0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_30
	j	.LBB0_28
.LBB0_28:                               #   in Loop: Header=BB0_27 Depth=2
	lw	a0, -140(s0)
	addi	a1, zero, 20
	mul	a1, a0, a1
	addi	a0, s0, -108
	add	a0, a0, a1
	lw	a1, -144(s0)
	slli	a1, a1, 2
	add	a0, a0, a1
	lw	a1, 0(a0)
	lui	a0, %hi(.L.str.1)
	addi	a0, a0, %lo(.L.str.1)
	call	printf
	j	.LBB0_29
.LBB0_29:                               #   in Loop: Header=BB0_27 Depth=2
	lw	a0, -144(s0)
	addi	a0, a0, 1
	sw	a0, -144(s0)
	j	.LBB0_27
.LBB0_30:                               #   in Loop: Header=BB0_25 Depth=1
	lui	a0, %hi(.L.str.2)
	addi	a0, a0, %lo(.L.str.2)
	call	printf
	j	.LBB0_31
.LBB0_31:                               #   in Loop: Header=BB0_25 Depth=1
	lw	a0, -140(s0)
	addi	a0, a0, 1
	sw	a0, -140(s0)
	j	.LBB0_25
.LBB0_32:
	lw	s0, 152(sp)                     # 4-byte Folded Reload
	lw	ra, 156(sp)                     # 4-byte Folded Reload
	addi	sp, sp, 160
	ret
.Lfunc_end0:
	.size	gaussian_blur, .Lfunc_end0-gaussian_blur
                                        # -- End function
	.globl	main                            # -- Begin function main
	.p2align	2
	.type	main,@function
main:                                   # @main
# %bb.0:
	addi	sp, sp, -16
	sw	ra, 12(sp)                      # 4-byte Folded Spill
	sw	s0, 8(sp)                       # 4-byte Folded Spill
	addi	s0, sp, 16
	mv	a0, zero
	sw	a0, -16(s0)                     # 4-byte Folded Spill
	sw	a0, -12(s0)
	call	gaussian_blur
	lw	a0, -16(s0)                     # 4-byte Folded Reload
	lw	s0, 8(sp)                       # 4-byte Folded Reload
	lw	ra, 12(sp)                      # 4-byte Folded Reload
	addi	sp, sp, 16
	ret
.Lfunc_end1:
	.size	main, .Lfunc_end1-main
                                        # -- End function
	.type	image,@object                   # @image
	.data
	.globl	image
	.p2align	2
image:
	.word	120                             # 0x78
	.word	50                              # 0x32
	.word	200                             # 0xc8
	.word	30                              # 0x1e
	.word	80                              # 0x50
	.word	90                              # 0x5a
	.word	180                             # 0xb4
	.word	60                              # 0x3c
	.word	40                              # 0x28
	.word	140                             # 0x8c
	.word	70                              # 0x46
	.word	20                              # 0x14
	.word	110                             # 0x6e
	.word	10                              # 0xa
	.word	160                             # 0xa0
	.word	130                             # 0x82
	.word	100                             # 0x64
	.word	150                             # 0x96
	.word	190                             # 0xbe
	.word	220                             # 0xdc
	.word	30                              # 0x1e
	.word	80                              # 0x50
	.word	120                             # 0x78
	.word	50                              # 0x32
	.word	200                             # 0xc8
	.size	image, 100

	.type	gaussian_kernel,@object         # @gaussian_kernel
	.globl	gaussian_kernel
	.p2align	2
gaussian_kernel:
	.word	0x3d800000                      # float 0.0625
	.word	0x3e000000                      # float 0.125
	.word	0x3d800000                      # float 0.0625
	.word	0x3e000000                      # float 0.125
	.word	0x3e800000                      # float 0.25
	.word	0x3e000000                      # float 0.125
	.word	0x3d800000                      # float 0.0625
	.word	0x3e000000                      # float 0.125
	.word	0x3d800000                      # float 0.0625
	.size	gaussian_kernel, 36

	.type	.L.str,@object                  # @.str
	.section	.rodata.str1.1,"aMS",@progbits,1
.L.str:
	.asciz	"Original Image:\n"
	.size	.L.str, 17

	.type	.L.str.1,@object                # @.str.1
.L.str.1:
	.asciz	"%d "
	.size	.L.str.1, 4

	.type	.L.str.2,@object                # @.str.2
.L.str.2:
	.asciz	"\n"
	.size	.L.str.2, 2

	.type	.L.str.3,@object                # @.str.3
.L.str.3:
	.asciz	"\nBlurred Image:\n"
	.size	.L.str.3, 17

	.ident	"riscv32-embecosm-clang-ubuntu1804-20210509 clang version 13.0.0 (https://mirrors.git.embecosm.com/mirrors/llvm-project.git 4aec8f4ce0f564aa68c23b9e29c2e3a945eec947)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
	.addrsig_sym gaussian_blur
	.addrsig_sym printf
	.addrsig_sym image
	.addrsig_sym gaussian_kernel
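
The listing above calls the soft-float helpers `__floatsisf`, `__mulsf3` and `__addsf3` for every kernel tap, because `rv32ima` has no F extension. Since every weight in `gaussian_kernel` is a multiple of 1/16, the same blur can be sketched with integers only. This is a minimal sketch, not the code that produced the listing: the names `gaussian_kernel_q4` and `gaussian_blur_int` are hypothetical, and it assumes the same 5x5 image and zero border as in the output above.

```c
#define WIDTH 5
#define HEIGHT 5

/* Hypothetical integer kernel: the float weights multiplied by 16. */
static const int gaussian_kernel_q4[3][3] = {
    {1, 2, 1},
    {2, 4, 2},
    {1, 2, 1}
};

/* Blur the interior pixels using integer arithmetic only.
 * Border pixels of `out` are set to 0, as in the output shown earlier. */
void gaussian_blur_int(int in[HEIGHT][WIDTH], int out[HEIGHT][WIDTH]) {
    for (int i = 0; i < HEIGHT; i++) {
        for (int j = 0; j < WIDTH; j++) {
            out[i][j] = 0;
        }
    }
    for (int i = 1; i < HEIGHT - 1; i++) {
        for (int j = 1; j < WIDTH - 1; j++) {
            int sum = 0;
            for (int m = -1; m <= 1; m++) {
                for (int n = -1; n <= 1; n++) {
                    sum += in[i + m][j + n] * gaussian_kernel_q4[m + 1][n + 1];
                }
            }
            /* Add 8 (half of 16) to round to nearest, then divide by 16. */
            out[i][j] = (sum + 8) >> 4;
        }
    }
}
```

For the 5x5 image used here, this gives the same interior values as the float version (104 88 74, 89 89 101, 93 113 138), without any soft-float library calls.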

Histogram Equalization

#include <stdio.h>
#include <stdlib.h>

#define WIDTH 5
#define HEIGHT 5
#define MAX_PIXEL_VALUE 255

// Image data
int image[WIDTH][HEIGHT] = {
    {120, 50, 200, 30, 80},
    {90, 180, 60, 40, 140},
    {70, 20, 110, 10, 160},
    {130, 100, 150, 190, 220},
    {30, 80, 120, 50, 200}
};

// Function for histogram equalization
void histogram_equalization() {
    // Compute the histogram
    int histogram[MAX_PIXEL_VALUE + 1] = {0};
    for (int i = 0; i < WIDTH; i++) {
        for (int j = 0; j < HEIGHT; j++) {
            histogram[image[i][j]]++;
        }
    }

    // Compute the cumulative histogram
    int cumulative_histogram[MAX_PIXEL_VALUE + 1] = {0};
    cumulative_histogram[0] = histogram[0];
    for (int i = 1; i <= MAX_PIXEL_VALUE; i++) {
        cumulative_histogram[i] = cumulative_histogram[i - 1] + histogram[i];
    }

    // Compute the equalized pixel values
    int equalized_image[WIDTH][HEIGHT];
    for (int i = 0; i < WIDTH; i++) {
        for (int j = 0; j < HEIGHT; j++) {
            equalized_image[i][j] = (int)(((float)cumulative_histogram[image[i][j]] / (WIDTH * HEIGHT)) * MAX_PIXEL_VALUE);
        }
    }

    // Output the original image
    printf("Original Image:\n");
    for (int i = 0; i < WIDTH; i++) {
        for (int j = 0; j < HEIGHT; j++) {
            printf("%d ", image[i][j]);
        }
        printf("\n");
    }

    // Output the equalized image
    printf("\nEqualized Image:\n");
    for (int i = 0; i < WIDTH; i++) {
        for (int j = 0; j < HEIGHT; j++) {
            printf("%d ", equalized_image[i][j]);
        }
        printf("\n");
    }
}

int main() {
    // Perform histogram equalization
    histogram_equalization();

    return 0;
}

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -march=rv32ima -mabi=ilp32 -o histogram.elf histogram.c --sysroot=/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf -L/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib -lc -lgloss -lm
terry@ubuntu:~/project$  ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-size histogram.elf
   text	   data	    bss	    dec	    hex	filename
  12646	    380	     56	  13082	   331a	histogram.elf
terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-readelf -h histogram.elf 
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           RISC-V
  Version:                           0x1
  Entry point address:               0x10094
  Start of program headers:          52 (bytes into file)
  Start of section headers:          263356 (bytes into file)
  Flags:                             0x1, RVC, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         24
  Section header string table index: 23
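
The `Flags` line in the `readelf` output is a decoded form of the `e_flags` field in the ELF header. To illustrate how the value `0x1` maps to "RVC, soft-float ABI", here is a small sketch of the decoding. The bit values follow the RISC-V ELF psABI as I understand it; the helper name `describe_riscv_flags` is hypothetical and is not part of any toolchain.

```c
#include <stdio.h>
#include <stdint.h>

/* Bit assignments of e_flags (assumed from the RISC-V ELF psABI). */
#define EF_RISCV_RVC              0x0001u
#define EF_RISCV_FLOAT_ABI_MASK   0x0006u
#define EF_RISCV_FLOAT_ABI_SOFT   0x0000u
#define EF_RISCV_FLOAT_ABI_SINGLE 0x0002u
#define EF_RISCV_FLOAT_ABI_DOUBLE 0x0004u

/* Hypothetical helper: write a readelf-like description of e_flags into buf. */
void describe_riscv_flags(uint32_t e_flags, char *buf, size_t len) {
    const char *abi;
    switch (e_flags & EF_RISCV_FLOAT_ABI_MASK) {
    case EF_RISCV_FLOAT_ABI_SOFT:   abi = "soft-float ABI";   break;
    case EF_RISCV_FLOAT_ABI_SINGLE: abi = "single-float ABI"; break;
    case EF_RISCV_FLOAT_ABI_DOUBLE: abi = "double-float ABI"; break;
    default:                        abi = "quad-float ABI";   break;
    }
    /* Bit 0 marks the presence of compressed (RVC) instructions. */
    snprintf(buf, len, "%s%s", (e_flags & EF_RISCV_RVC) ? "RVC, " : "", abi);
}
```

Calling `describe_riscv_flags(0x1, buf, sizeof buf)` fills `buf` with `RVC, soft-float ABI`, which agrees with the `-mabi=ilp32` option used for the build.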

Running on QEMU

terry@ubuntu:~/qemu$ qemu-riscv32 ./histogram.elf
Original Image:
120 50 200 30 80 
90 180 60 40 140 
70 20 110 10 160 
130 100 150 190 220 
30 80 120 50 200 

Equalized Image:
163 71 244 40 112 
122 214 81 51 183 
91 20 142 10 204 
173 132 193 224 255 
40 112 163 71 244

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -S -march=rv32ima -mabi=ilp32 -o histogram.S histogram.c

assembly

	.text
	.attribute	4, 16
	.attribute	5, "rv32i2p0_m2p0_a2p0"
	.file	"histogram.c"
	.globl	histogram_equalization          # -- Begin function histogram_equalization
	.p2align	2
	.type	histogram_equalization,@function
histogram_equalization:                 # @histogram_equalization
# %bb.0:
	addi	sp, sp, -2032
	sw	ra, 2028(sp)                    # 4-byte Folded Spill
	sw	s0, 2024(sp)                    # 4-byte Folded Spill
	addi	s0, sp, 2032
	addi	sp, sp, -176
	addi	a0, s0, -1036
	mv	a1, zero
	lui	a2, 1048575
	addi	a2, a2, 1896
	add	a2, s0, a2
	sw	a1, 0(a2)                       # 4-byte Folded Spill
	addi	a2, zero, 1024
	call	memset@plt
                                        # kill: def $x11 killed $x10
	lui	a0, 1048575
	addi	a0, a0, 1896
	add	a0, s0, a0
	lw	a0, 0(a0)                       # 4-byte Folded Reload
	sw	a0, -1040(s0)
	j	.LBB0_1
.LBB0_1:                                # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_3 Depth 2
	lw	a1, -1040(s0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_8
	j	.LBB0_2
.LBB0_2:                                #   in Loop: Header=BB0_1 Depth=1
	mv	a0, zero
	sw	a0, -1044(s0)
	j	.LBB0_3
.LBB0_3:                                #   Parent Loop BB0_1 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lw	a1, -1044(s0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_6
	j	.LBB0_4
.LBB0_4:                                #   in Loop: Header=BB0_3 Depth=2
	lw	a0, -1040(s0)
	addi	a1, zero, 20
	mul	a0, a0, a1
	lw	a1, -1044(s0)
	slli	a1, a1, 2
	add	a0, a0, a1
	lui	a1, %hi(image)
	addi	a1, a1, %lo(image)
	add	a0, a0, a1
	lw	a0, 0(a0)
	slli	a1, a0, 2
	addi	a0, s0, -1036
	add	a1, a0, a1
	lw	a0, 0(a1)
	addi	a0, a0, 1
	sw	a0, 0(a1)
	j	.LBB0_5
.LBB0_5:                                #   in Loop: Header=BB0_3 Depth=2
	lw	a0, -1044(s0)
	addi	a0, a0, 1
	sw	a0, -1044(s0)
	j	.LBB0_3
.LBB0_6:                                #   in Loop: Header=BB0_1 Depth=1
	j	.LBB0_7
.LBB0_7:                                #   in Loop: Header=BB0_1 Depth=1
	lw	a0, -1040(s0)
	addi	a0, a0, 1
	sw	a0, -1040(s0)
	j	.LBB0_1
.LBB0_8:
	lui	a0, 1048575
	addi	a0, a0, 2028
	add	a0, s0, a0
	mv	a1, zero
	addi	a2, zero, 1024
	call	memset@plt
	lw	a0, -1036(s0)
	lui	a1, 1048575
	addi	a1, a1, 2028
	add	a1, s0, a1
	sw	a0, 0(a1)
	addi	a0, zero, 1
	lui	a1, 1048575
	addi	a1, a1, 2024
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_9
.LBB0_9:                                # =>This Inner Loop Header: Depth=1
	lui	a0, 1048575
	addi	a0, a0, 2024
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 255
	blt	a0, a1, .LBB0_12
	j	.LBB0_10
.LBB0_10:                               #   in Loop: Header=BB0_9 Depth=1
	lui	a0, 1048575
	addi	a0, a0, 2024
	add	a0, s0, a0
	lw	a0, 0(a0)
	slli	a3, a0, 2
	lui	a0, 1048575
	addi	a0, a0, 2028
	add	a0, s0, a0
	add	a1, a0, a3
	lw	a0, -4(a1)
	addi	a2, s0, -1036
	add	a2, a2, a3
	lw	a2, 0(a2)
	add	a0, a0, a2
	sw	a0, 0(a1)
	j	.LBB0_11
.LBB0_11:                               #   in Loop: Header=BB0_9 Depth=1
	lui	a0, 1048575
	addi	a0, a0, 2024
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 2024
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_9
.LBB0_12:
	mv	a0, zero
	lui	a1, 1048575
	addi	a1, a1, 1920
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_13
.LBB0_13:                               # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_15 Depth 2
	lui	a0, 1048575
	addi	a0, a0, 1920
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_20
	j	.LBB0_14
.LBB0_14:                               #   in Loop: Header=BB0_13 Depth=1
	mv	a0, zero
	lui	a1, 1048575
	addi	a1, a1, 1916
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_15
.LBB0_15:                               #   Parent Loop BB0_13 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1916
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_18
	j	.LBB0_16
.LBB0_16:                               #   in Loop: Header=BB0_15 Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1920
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a1, zero, 20
	mul	a0, a0, a1
	lui	a1, 1048575
	addi	a1, a1, 1888
	add	a1, s0, a1
	sw	a0, 0(a1)                       # 4-byte Folded Spill
	lui	a1, 1048575
	addi	a1, a1, 1916
	add	a1, s0, a1
	lw	a1, 0(a1)
	slli	a1, a1, 2
	lui	a2, 1048575
	addi	a2, a2, 1892
	add	a2, s0, a2
	sw	a1, 0(a2)                       # 4-byte Folded Spill
	add	a0, a0, a1
	lui	a1, %hi(image)
	addi	a1, a1, %lo(image)
	add	a0, a0, a1
	lw	a0, 0(a0)
	slli	a1, a0, 2
	lui	a0, 1048575
	addi	a0, a0, 2028
	add	a0, s0, a0
	add	a0, a0, a1
	lw	a0, 0(a0)
	call	__floatsisf@plt
	lui	a1, 269440
	call	__divsf3@plt
	lui	a1, 276464
	call	__mulsf3@plt
	call	__fixsfsi@plt
	lui	a1, 1048575
	addi	a1, a1, 1888
	add	a1, s0, a1
	lw	a3, 0(a1)                       # 4-byte Folded Reload
	lui	a1, 1048575
	addi	a1, a1, 1892
	add	a1, s0, a1
	lw	a2, 0(a1)                       # 4-byte Folded Reload
	lui	a1, 1048575
	addi	a1, a1, 1924
	add	a1, s0, a1
	add	a1, a1, a3
	add	a1, a1, a2
	sw	a0, 0(a1)
	j	.LBB0_17
.LBB0_17:                               #   in Loop: Header=BB0_15 Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1916
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 1916
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_15
.LBB0_18:                               #   in Loop: Header=BB0_13 Depth=1
	j	.LBB0_19
.LBB0_19:                               #   in Loop: Header=BB0_13 Depth=1
	lui	a0, 1048575
	addi	a0, a0, 1920
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 1920
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_13
.LBB0_20:
	lui	a0, %hi(.L.str)
	addi	a0, a0, %lo(.L.str)
	call	printf
	mv	a0, zero
	lui	a1, 1048575
	addi	a1, a1, 1912
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_21
.LBB0_21:                               # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_23 Depth 2
	lui	a0, 1048575
	addi	a0, a0, 1912
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_28
	j	.LBB0_22
.LBB0_22:                               #   in Loop: Header=BB0_21 Depth=1
	mv	a0, zero
	lui	a1, 1048575
	addi	a1, a1, 1908
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_23
.LBB0_23:                               #   Parent Loop BB0_21 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1908
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_26
	j	.LBB0_24
.LBB0_24:                               #   in Loop: Header=BB0_23 Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1912
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a1, zero, 20
	mul	a0, a0, a1
	lui	a1, 1048575
	addi	a1, a1, 1908
	add	a1, s0, a1
	lw	a1, 0(a1)
	slli	a1, a1, 2
	add	a0, a0, a1
	lui	a1, %hi(image)
	addi	a1, a1, %lo(image)
	add	a0, a0, a1
	lw	a1, 0(a0)
	lui	a0, %hi(.L.str.1)
	addi	a0, a0, %lo(.L.str.1)
	call	printf
	j	.LBB0_25
.LBB0_25:                               #   in Loop: Header=BB0_23 Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1908
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 1908
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_23
.LBB0_26:                               #   in Loop: Header=BB0_21 Depth=1
	lui	a0, %hi(.L.str.2)
	addi	a0, a0, %lo(.L.str.2)
	call	printf
	j	.LBB0_27
.LBB0_27:                               #   in Loop: Header=BB0_21 Depth=1
	lui	a0, 1048575
	addi	a0, a0, 1912
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 1912
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_21
.LBB0_28:
	lui	a0, %hi(.L.str.3)
	addi	a0, a0, %lo(.L.str.3)
	call	printf
	mv	a0, zero
	lui	a1, 1048575
	addi	a1, a1, 1904
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_29
.LBB0_29:                               # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_31 Depth 2
	lui	a0, 1048575
	addi	a0, a0, 1904
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_36
	j	.LBB0_30
.LBB0_30:                               #   in Loop: Header=BB0_29 Depth=1
	mv	a0, zero
	lui	a1, 1048575
	addi	a1, a1, 1900
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_31
.LBB0_31:                               #   Parent Loop BB0_29 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1900
	add	a0, s0, a0
	lw	a1, 0(a0)
	addi	a0, zero, 4
	blt	a0, a1, .LBB0_34
	j	.LBB0_32
.LBB0_32:                               #   in Loop: Header=BB0_31 Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1904
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a1, zero, 20
	mul	a1, a0, a1
	lui	a0, 1048575
	addi	a0, a0, 1924
	add	a0, s0, a0
	add	a0, a0, a1
	lui	a1, 1048575
	addi	a1, a1, 1900
	add	a1, s0, a1
	lw	a1, 0(a1)
	slli	a1, a1, 2
	add	a0, a0, a1
	lw	a1, 0(a0)
	lui	a0, %hi(.L.str.1)
	addi	a0, a0, %lo(.L.str.1)
	call	printf
	j	.LBB0_33
.LBB0_33:                               #   in Loop: Header=BB0_31 Depth=2
	lui	a0, 1048575
	addi	a0, a0, 1900
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 1900
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_31
.LBB0_34:                               #   in Loop: Header=BB0_29 Depth=1
	lui	a0, %hi(.L.str.2)
	addi	a0, a0, %lo(.L.str.2)
	call	printf
	j	.LBB0_35
.LBB0_35:                               #   in Loop: Header=BB0_29 Depth=1
	lui	a0, 1048575
	addi	a0, a0, 1904
	add	a0, s0, a0
	lw	a0, 0(a0)
	addi	a0, a0, 1
	lui	a1, 1048575
	addi	a1, a1, 1904
	add	a1, s0, a1
	sw	a0, 0(a1)
	j	.LBB0_29
.LBB0_36:
	addi	sp, sp, 176
	lw	s0, 2024(sp)                    # 4-byte Folded Reload
	lw	ra, 2028(sp)                    # 4-byte Folded Reload
	addi	sp, sp, 2032
	ret
.Lfunc_end0:
	.size	histogram_equalization, .Lfunc_end0-histogram_equalization
                                        # -- End function
	.globl	main                            # -- Begin function main
	.p2align	2
	.type	main,@function
main:                                   # @main
# %bb.0:
	addi	sp, sp, -16
	sw	ra, 12(sp)                      # 4-byte Folded Spill
	sw	s0, 8(sp)                       # 4-byte Folded Spill
	addi	s0, sp, 16
	mv	a0, zero
	sw	a0, -16(s0)                     # 4-byte Folded Spill
	sw	a0, -12(s0)
	call	histogram_equalization
	lw	a0, -16(s0)                     # 4-byte Folded Reload
	lw	s0, 8(sp)                       # 4-byte Folded Reload
	lw	ra, 12(sp)                      # 4-byte Folded Reload
	addi	sp, sp, 16
	ret
.Lfunc_end1:
	.size	main, .Lfunc_end1-main
                                        # -- End function
	.type	image,@object                   # @image
	.data
	.globl	image
	.p2align	2
image:
	.word	120                             # 0x78
	.word	50                              # 0x32
	.word	200                             # 0xc8
	.word	30                              # 0x1e
	.word	80                              # 0x50
	.word	90                              # 0x5a
	.word	180                             # 0xb4
	.word	60                              # 0x3c
	.word	40                              # 0x28
	.word	140                             # 0x8c
	.word	70                              # 0x46
	.word	20                              # 0x14
	.word	110                             # 0x6e
	.word	10                              # 0xa
	.word	160                             # 0xa0
	.word	130                             # 0x82
	.word	100                             # 0x64
	.word	150                             # 0x96
	.word	190                             # 0xbe
	.word	220                             # 0xdc
	.word	30                              # 0x1e
	.word	80                              # 0x50
	.word	120                             # 0x78
	.word	50                              # 0x32
	.word	200                             # 0xc8
	.size	image, 100

	.type	.L.str,@object                  # @.str
	.section	.rodata.str1.1,"aMS",@progbits,1
.L.str:
	.asciz	"Original Image:\n"
	.size	.L.str, 17

	.type	.L.str.1,@object                # @.str.1
.L.str.1:
	.asciz	"%d "
	.size	.L.str.1, 4

	.type	.L.str.2,@object                # @.str.2
.L.str.2:
	.asciz	"\n"
	.size	.L.str.2, 2

	.type	.L.str.3,@object                # @.str.3
.L.str.3:
	.asciz	"\nEqualized Image:\n"
	.size	.L.str.3, 19

	.ident	"riscv32-embecosm-clang-ubuntu1804-20210509 clang version 13.0.0 (https://mirrors.git.embecosm.com/mirrors/llvm-project.git 4aec8f4ce0f564aa68c23b9e29c2e3a945eec947)"
	.section	".note.GNU-stack","",@progbits
	.addrsig
	.addrsig_sym histogram_equalization
	.addrsig_sym printf
	.addrsig_sym image
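
The inner loop in the listing above goes through `__floatsisf`, `__divsf3`, `__mulsf3` and `__fixsfsi` for each pixel. One possible alternative, shown here as a sketch rather than the code that produced the listing, is to precompute a 256-entry lookup table with integer arithmetic. The function name `build_equalization_lut` is hypothetical. The integer division truncates, which matches the float version for this image but is not guaranteed to match for every input.

```c
#define MAX_PIXEL_VALUE 255

/* Hypothetical helper: fill lut[v] with the equalized value for pixel value v.
 * `pixels` holds `count` values (count > 0), each in 0..MAX_PIXEL_VALUE.
 * Assumes count * MAX_PIXEL_VALUE fits in an int. */
void build_equalization_lut(const int *pixels, int count,
                            int lut[MAX_PIXEL_VALUE + 1]) {
    int histogram[MAX_PIXEL_VALUE + 1] = {0};

    /* Count how often each pixel value occurs. */
    for (int k = 0; k < count; k++) {
        histogram[pixels[k]]++;
    }

    /* The running sum is the cumulative histogram; scale it to 0..255. */
    int cumulative = 0;
    for (int v = 0; v <= MAX_PIXEL_VALUE; v++) {
        cumulative += histogram[v];
        lut[v] = (cumulative * MAX_PIXEL_VALUE) / count;
    }
}
```

With the table built from `&image[0][0]` and `WIDTH * HEIGHT`, each output pixel is simply `lut[image[i][j]]`, so the per-pixel loop no longer needs any floating-point helper.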

Problem Analysis:

terry@ubuntu:~/qemu$ ./configure --target-list=riscv64-softmmu --prefix=$RISCV/qemu
ERROR: "cc" either does not exist or does not work
==========================
sudo apt-get update
sudo apt-get install build-essential
==========================

The compiler could not find the header file

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -march=rv32ima -mabi=ilp32 -o sobel.elf sobel.c
sobel.c:1:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
         ^~~~~~~~~
1 error generated.
====================
export C_INCLUDE_PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/include
====================

The linker needs the paths to crt0.o and libc.a

terry@ubuntu:~/project$ ~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -march=rv32ima -mabi=ilp32 -o sobel.elf sobel.c
/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-ld: cannot find crt0.o: No such file or directory
/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-ld: cannot find -lc
/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-ld: cannot find -lgloss
clang-13: error: ld command failed with exit code 1 (use -v to see invocation)
=======================
export LIBRARY_PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/lib
export LDFLAGS="-L/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib -lc -lgloss"
~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/bin/riscv32-unknown-elf-clang -march=rv32ima -mabi=ilp32 -o sobel.elf sobel.c \
--sysroot=/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf \
-L/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib \
-lc -lgloss -lm
=======================

  CC      hw/display/edid-generate.o
  CC      scsi/qemu-pr-helper.o
make: *** No rule to make target 'riscv64-softmmu/config-devices.mak', needed by 'config-all-devices.mak'.  Stop.
make: *** Waiting for unfinished jobs....
  CC      qemu-bridge-helper.o
  =====================
  ./configure --target-list=riscv64-softmmu --prefix=$RISCV/qemu
make -j $(nproc)
  =====================

Remote branch not found

terry@ubuntu:~/project/qemu$ git clone --branch rvv-1.0-upstream-v7-fix https://github.com/sifive/qemu.git
Cloning into 'qemu'...
fatal: Remote branch rvv-1.0-upstream-v7-fix not found in upstream origin
=================
git ls-remote --heads https://github.com/sifive/qemu.git
=================

fatal error: cannot execute 'cc1'

terry@ubuntu:~/project$ CROSS_COMPILE=riscv64-unknown-linux-gnu- make install
riscv64-unknown-linux-gnu-gcc: fatal error: cannot execute 'cc1': execvp: No such file or directory
compilation terminated.
  CC      applets/applets.o
riscv64-unknown-linux-gnu-gcc: fatal error: cannot execute 'cc1': execvp: No such file or directory
compilation terminated.
make[1]: *** [scripts/Makefile.build:198: applets/applets.o] Error 1
make: *** [Makefile:372: applets_dir] Error 2

I found this page and used it to check my toolchain path:
https://stackoverflow.com/questions/56810443/gcc-without-full-path-error-trying-to-exec-cc1-execvp-no-such-file-or-direc
Running the suggested command closed my terminal window, so I cannot show the result.

terry@ubuntu:~/project$ qemu-system-riscv32 -nographic -kernel ./sobel.elf

I tried again with user-mode QEMU, and the program finally ran:

sudo  ./configure --target-list=riscv32-linux-user --prefix=$RISCV/qemu
sudo apt install qemu-user
==================
terry@ubuntu:~/qemu$ qemu-riscv32 ./hello.elf
Hello World
==================

Environment Variables

export PATH=/home/terry/riscv-gnu-toolchain/build-gcc-newlib-stage1/gcc/:$PATH
export PATH=~/project/llvm-clang/bin:$PATH
export PATH=~/project/gcc/bin:$PATH
export PATH=~/qemu/build:$PATH
export PATH="/qemu/bin:$PATH"
export PATH=$PATH:/opt/riscv32/bin
export PATH=/project/gc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/bin:$PATH
export PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/bin:$PATH
export LIBRARY_PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/lib
export C_INCLUDE_PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/include
export LIBRARY_PATH=~/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/lib/rv32:$LIBRARY_PATH
export LIBRARY_PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/lib:$LIBRARY_PATH
export LIBRARY_PATH=~/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/lib/rv32imafc/ilp32f:$LIBRARY_PATH
export LIBRARY_PATH=/home/terry/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/lib/rv32im/ilp32/crt0.o:$LIBRARY_PATH
export LIBRARY_PATH=/home/terry/project/gcc/riscv32-embecosm-gcc-ubuntu1804-20210523-defaultnewlib/riscv32-unknown-elf/lib/crt0.o:$LIBRARY_PATH
export LIBRARY_PATH=/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib/crt0.o:$LIBRARY_PATH
export LDFLAGS="-L/home/terry/project/llvm-clang/riscv32-embecosm-clang-ubuntu1804-20210509/riscv32-unknown-elf/lib -lc -lgloss"

I wanted to use qemu-system-riscv32 to run the .elf file on QEMU, but I could not find out how to print the result in this UI.
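
One likely reason nothing is printed: `qemu-riscv32` (user mode) handles the program's system calls on the host, while `qemu-system-riscv32` emulates a bare machine with no operating system to receive them. On a bare machine, output usually has to be written to a device directly. The following is a rough sketch only. It assumes QEMU's `virt` machine, where a 16550-style UART is mapped at `0x10000000`; the helper name `uart_puts` is hypothetical, and a real bare-metal build would also need startup code and a linker script, which are not shown.

```c
#include <stdint.h>

/* Assumption: UART0 base address on QEMU's RISC-V `virt` machine. */
#define VIRT_UART0_BASE 0x10000000u

/* Hypothetical helper: write a string byte by byte to a UART transmit
 * register (offset 0 of the device). Real hardware would first poll the
 * line status register; this sketch skips that step. */
void uart_puts(volatile uint8_t *uart_tx, const char *s) {
    while (*s != '\0') {
        *uart_tx = (uint8_t)*s;
        s++;
    }
}
```

On the target it would be called as `uart_puts((volatile uint8_t *)(uintptr_t)VIRT_UART0_BASE, "Hello World\n");`, with QEMU started using something like `-machine virt -nographic -bios none -kernel <file>`. I have not verified this with the .elf files built above.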

I am also trying to find out which instructions I can use.

terry@ubuntu:~/qemu/linux-user/riscv$ ls
cpu_loop.c  sockbits.h	    syscall64_nr.h  target_cpu.h  target_fcntl.h   target_structs.h  termbits.h
signal.c    syscall32_nr.h  syscall_nr.h    target_elf.h  target_signal.h  target_syscall.h

ref : https://lists.riscv.org/g/tech-vector-ext/topic/85564316?p=,20,0,0,0::recentpostdate/sticky,20,2,0,85564316

Not working.