Computer Architecture Final Project

Contributed by: < HotMercury (p76111741) >, < freshLiver (P76114016) >, < tinhanho (P76121364) >

Summary

<HotMercury (p76111741)>
<freshLiver (P76114016)>
- Fix Issue #258 (Immediate Bit Range Checking)
- Add testing unit for checking bit range
<tinhanho (P76121364)>
- Add cache replacement policy: FIFO

Setup

Build from Source

Clone the official repo and use the provided dockerfile to build the environment
- It may take a long time to build the environment
Clone your ripes repo
- You MUST clone with the --recurse-submodules option, or the build will fail.
- Otherwise, run git submodule update --init --recursive after cloning, because Ripes depends on the external sources VSRTL, ELFIO and libelfin.
Enter the environment with docker run --rm --name=ripes -it --entrypoint=/bin/bash ripes:latest
- Add a new user inside the container with useradd -m user (assuming the host $UID is also 1000)
- Commit the container from another terminal (make sure the docker guest is idle) with docker commit ripes cafinal:user
- Stop the current guest with docker stop ripes
Enter the environment with docker run --rm --name=ripes -u 1000 -it -v $YOUR_RIPES_DIR:/ripes cafinal:user:
- This command maps your ripes directory into the guest environment
- This command runs the environment as a regular user instead of root, so that files keep their owner and group
- Don't forget to REPLACE $YOUR_RIPES_DIR in the command
Go to the mapped directory inside the guest environment

Build your ripes with the following command

$ cmake -S . -B ./build -Wno-dev  -DRIPES_BUILD_TESTS=ON -DVSRTL_BUILD_TESTS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/6.5.0/gcc_64/lib/cmake"
$ make -C build

Your Ripes should now be built successfully
Test your ripes
- Go to the test directory inside the build path: cd /ripes/build/test
- Run the tests: ./tst_assembler && ./tst_expreval && ./tst_riscv

All tests should pass:

$ ./tst_assembler && ./tst_expreval && ./tst_riscv
[...]
Totals: 10 passed, 0 failed, 0 skipped, 0 blacklisted, 11495ms
********* Finished testing of tst_RISCV *********

Run it

Run the compiled binary with the following command

docker run --rm -it -v $YOUR_RIPES_DIR:/ripes -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --entrypoint=/ripes/build/Ripes cafinal:user

Build in local

git clone ripes
- Adding the --recurse-submodules option is recommended.

Make sure that all tools below are installed.

Note that Ripes requires Qt6 instead of Qt5.

sudo apt install qt6-base-dev
sudo apt-get install qt6-tools-dev
sudo apt-get install qt6-tools-dev-tools
sudo apt-get install libqt6charts6-dev

Make sure that the submodules (VSRTL, ELFIO and libelfin) are all present.
- If they are not, fetch them with git submodule update --init --recursive
Check whether the 6.5.0 directory exists. If not, run sudo aqt install-qt linux desktop 6.5.0 gcc_64 -m qtcharts

Under ripes directory

Note that CMake creates the build directory and writes its configuration there.

cmake -B build \
-Wno-dev            \
-DRIPES_BUILD_TESTS=ON \
-DVSRTL_BUILD_TESTS=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_PREFIX_PATH=$(pwd)/6.5.0/gcc_64/

In the build directory, run sudo make install
Execute ./Ripes
- Use sudo ./Ripes if you want to test system calls such as file open.

Issue

Issue #196

We found that when using the open system call, the mode (permission) bits need to be specified in octal form. According to the Linux man page, 00200 (S_IWUSR) means that the user has write permission. However, the subsequent write operation fails with a write error. The issue appears to be related to Ripes interpreting the number as decimal instead of octal.
Read/write test in Ripes:
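To make the two readings of the literal concrete, here is a minimal sketch in plain C++. The helper parseImmediate is hypothetical and is not how Ripes parses immediates; it only shows that the token 00200 yields 128 when read as octal and 200 when read as decimal.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Ripes code): parse an immediate token in a given base.
inline int parseImmediate(const std::string &token, int base) {
  return std::stoi(token, nullptr, base);
}

// Usage: the same token produces different values depending on the base.
inline void demoOctalVsDecimal() {
  assert(parseImmediate("00200", 8) == 128);  // read as octal (S_IWUSR)
  assert(parseImmediate("00200", 10) == 200); // read as decimal
  assert(parseImmediate("00200", 0) == 128);  // base 0 auto-detects; a leading 0 means octal
}
```

If the assembler reads the token as decimal, the value that ends up in a1 differs from what the man page describes, which is consistent with the suspicion above.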

.data
file_address:
    .asciz "/tmp/hello_world.txt"
greeting:
    .asciz "hello world"
open_error:
    .asciz "open error"
read_error:
    .asciz "read error"
write_error:
    .asciz "write error"
success:
    .asciz "success!"

.text
main:
    # open syscall 1024
    # a0 Pointer to null terminated string for the path
    # a1 flags
    # flags -> O_RDONLY 00, O_WRONLY 01, O_RDWR 02
    # S_IWUSR  00200 user has write permission
    li s0, -1
    la a0, file_address
    li a1, 0x2
    
    # here is the problem
    ori a1, a1, 00200
    
    li a7, 1024
    ecall
    beq a0, s0, open_fail
    
    # write syscall 64
    # a0 the file descriptor
    # a1 address of the buffer
    # a2 number of bytes to write
    addi s1, a0, 0
    la a1, greeting
    li a2, 11
    li a7, 64
    ecall
    beq a0, s0, write_fail
    
    
close:   
    # close syscall 57
    # a0 the file descriptor to close
    mv a0, s1
    li a7, 57
    ecall
    j end
    
open_fail:
    la a0, open_error
    li a7, 4
    ecall
    j end
     
write_fail:
    la a0, write_error
    li a7, 4
    ecall 
    j close
     
 end:
     nop

When using the open(path, flags) syscall implemented by Ripes, data cannot be written to the file if a1 (flags) is set to 1 (O_WRONLY) or 2 (O_RDWR). Instead, it has to be set to 3.

However, QFile::write itself does not require the flag to be 3. We can write a simple program to test this:

// test.cpp
#include <QFile>

int main(int argc, char *argv[]) {
    // Usage: ./test <file> <message>
    if (argc < 3)
        return 1;

    const char *msg = argv[2];
    QFile file(argv[1]);

    // Open for writing only; no read flag is involved
    if (!file.open(QIODevice::WriteOnly))
        return 1;
    file.write(msg, qstrlen(msg));
    file.close();
    return 0;
}

Then, create a CMakeLists.txt file with the following lines:

cmake_minimum_required(VERSION 3.16)
project(test)
find_package(Qt6 COMPONENTS Core REQUIRED)
qt_add_executable(test test.cpp)

Then build and test the simple program with:

$ cmake -S . -B build -DCMAKE_PREFIX_PATH="/6.5.0/gcc_64/lib/cmake"
$ cmake --build build
$ ./build/test yoyo aaaa
$ cat yoyo

The created file yoyo should contain the string aaaa.

If we check the write syscall implementation, we find that it explicitly checks the fd against O_WRONLY | O_RDWR:

// src/syscall/systemio.h
static int writeToFile(int fd, const QString &myBuffer, int lengthRequested) {
SystemIO::get(); // Ensure that SystemIO is constructed
if (fd == STDOUT || fd == STDERR) {
  emit get().doPrint(myBuffer);
  return myBuffer.size();
}

if (!FileIOData::fdInUse(
        fd, O_WRONLY | O_RDWR)) // Check the existence of the "write" fd
{
  s_fileErrorString =
      "File descriptor " + QString::number(fd) + " is not open for writing";
  return -1;
}
// retrieve FileOutputStream from storage
auto &outputStream = FileIOData::getStreamInUse(fd);

outputStream << myBuffer;
outputStream.flush();
return lengthRequested;

} // end writeToFile

Why does the implementation require the flags to be O_WRONLY | O_RDWR?

Issue #258 (PR #339)

Details

Description

As the issue description says, in the latest Ripes, most of the I-type instructions check whether the given immediate fits within the width allowed by the instruction.

For example, the addi instruction ensures that the given immediate fits in 12 bits, as the following image shows:

However, the immediate of the lui instruction should only accept a value that fits in 20 bits, but the current version doesn't enforce this limit correctly, as shown in the above image.

Tracing

To find the problem, I used std::cout to dump the width in the checkFitsInWidth function, which checks the immediate range:

static Result<> checkFitsInWidth(Reg_T_S value, const Location &sourceLine,
                                  ImmConvInfo &convInfo,
                                  QString token = QString()) {

  std::cout << __PRETTY_FUNCTION__ << std::endl
            << "check token '" << token.toStdString() << "' (expected width=" << width
            << ") at #" << sourceLine.sourceLine() << std::endl;

  ...

Then, rebuild Ripes and test it with the following code:

main:
    addi x1, x1, 0x123455
    lui x1, 0x12345678

The output shows that the width of the lui instruction is misconfigured as 32 instead of 20:

static Ripes::Result<> Ripes::ImmBase<tokenIndex, width, repr, ImmParts, symbolType, transformer>::checkFitsInWidth(Ripes::ImmBase<tokenIndex, width, repr, ImmParts, symbolType, transformer>::Reg_T_S, const Ripes::Location&, Ripes::ImmConvInfo&, QString) [with unsigned int tokenIndex = 2; unsigned int width = 12; Ripes::Repr repr = Ripes::Repr::Signed; ImmParts = Ripes::ImmPartBase<0, Ripes::BitRange<20, 31, 32> >; Ripes::SymbolType symbolType = Ripes::SymbolType::None; Ripes::Reg_T (* transformer)(Ripes::Reg_T) = Ripes::defaultTransformer; Ripes::ImmBase<tokenIndex, width, repr, ImmParts, symbolType, transformer>::Reg_T_S = long int]
check token '0x123455' (expected width=12) at #1
static Ripes::Result<> Ripes::ImmBase<tokenIndex, width, repr, ImmParts, symbolType, transformer>::checkFitsInWidth(Ripes::ImmBase<tokenIndex, width, repr, ImmParts, symbolType, transformer>::Reg_T_S, const Ripes::Location&, Ripes::ImmConvInfo&, QString) [with unsigned int tokenIndex = 1; unsigned int width = 32; Ripes::Repr repr = Ripes::Repr::Hex; ImmParts = Ripes::ImmPartBase<0, Ripes::BitRange<12, 31, 32> >; Ripes::SymbolType symbolType = Ripes::SymbolType::None; Ripes::Reg_T (* transformer)(Ripes::Reg_T) = Ripes::defaultTransformer; Ripes::ImmBase<tokenIndex, width, repr, ImmParts, symbolType, transformer>::Reg_T_S = long int]
check token '0x12345678' (expected width=32) at #2

This originates from the U-type instruction implementation, where the width is configured as 32:

/// A RISC-V immediate field with an input width of 32 bits.
/// Used in U-Type instructions.
///
/// It is defined as:
///  - Imm[31:12] = Inst[31:12]
///  - Imm[11:0]  = 0
constexpr static unsigned VALID_INDEX = 1;
template <unsigned index, SymbolType symbolType>
struct ImmU
    : public ImmSym<index, 32, Repr::Hex, ImmPart<0, 12, 31>, symbolType> {
  static_assert(index == VALID_INDEX, "Invalid token index");
};

/// A U-Type RISC-V instruction
template <typename InstrImpl, RVISA::OpcodeID opcodeID,
          SymbolType symbolType = SymbolType::None>
class Instr : public RV_Instruction<InstrImpl> {
  template <unsigned index>
  using Imm = ImmU<index, symbolType>;

public:
  struct Opcode : public OpcodeSet<OpPartOpcode<opcodeID>> {};
  struct Fields : public FieldSet<RegRd, Imm> {};
};

struct Auipc
    : public Instr<Auipc, RVISA::OpcodeID::AUIPC, SymbolType::Absolute> {
  constexpr static std::string_view NAME = "auipc";
};

struct Lui : public Instr<Lui, RVISA::OpcodeID::LUI> {
  constexpr static std::string_view NAME = "lui";
};

} // namespace TypeU

Solve

After changing the width to 20, the bit range checking works as expected for both lui and auipc:
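The effect of the width on the check can be sketched in isolation. The helpers fitsUnsigned and fitsSigned below are hypothetical and are not the Ripes checkFitsInWidth implementation; they only illustrate why 0x12345678 passes with a width of 32 and is rejected with a width of 20.

```cpp
#include <cstdint>

// Hypothetical helper (not Ripes code): does `value` fit in `width` bits
// when treated as an unsigned quantity?
constexpr bool fitsUnsigned(std::uint64_t value, unsigned width) {
  return width >= 64 || (value >> width) == 0;
}

// Same idea for a two's-complement signed quantity.
constexpr bool fitsSigned(std::int64_t value, unsigned width) {
  if (width == 0)
    return false;
  if (width >= 64)
    return true;
  const std::int64_t lo = -(std::int64_t{1} << (width - 1));
  const std::int64_t hi = (std::int64_t{1} << (width - 1)) - 1;
  return lo <= value && value <= hi;
}

static_assert(fitsUnsigned(0x12345678, 32));  // accepted while the width is 32
static_assert(!fitsUnsigned(0x12345678, 20)); // rejected once the width is 20
static_assert(fitsUnsigned(0x12345, 20));     // a real 20-bit value still passes
static_assert(!fitsSigned(0x123455, 12));     // the addi case from the test program
static_assert(fitsSigned(-2048, 12) && fitsSigned(2047, 12));
```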

Testing

Currently, I have only used the provided test utilities to check that my changes didn't break anything:

/ripes/build/test$ ./tst_assembler && ./tst_expreval && ./tst_riscv

...

Totals: 10 passed, 0 failed, 0 skipped, 0 blacklisted, 9676ms
********* Finished testing of tst_RISCV *********

However, since I'm not that familiar with C++, I'm not sure whether I understood the bit range checking process correctly. If there is any problem with my changes, please let me know.

Add testing unit (Commit #85e406b)

Minor Issues

Typo (PR #336, Commit #07e2ef9)

In file Ripes/src/syscall/ripes_syscall.h:


/**
 * @brief The SyscallManager class
 *
 * It is expected that the syscallManager can be called outside of the main GUI
 * thread. As such, all syscalls who require GUI interaction must handle this
 * explicitely.
 */
class SyscallManager {
    ...
}

The word explicitely should be explicitly.

Interactions

Hi @matsievskiysv

According to the code in systemio.h at line 80,

 static constexpr int O_RDONLY = 0x00000000;
 static constexpr int O_WRONLY = 0x00000001;
 static constexpr int O_RDWR = 0x00000002;

which defines the value of the flags used by the open syscall.

However, data cannot be written to the file if we only set O_WRONLY when calling the open syscall.

Looking at the conditional statement at lines 408-409, it passes fd and 3 (O_WRONLY | O_RDWR) as parameters

if (!FileIOData::fdInUse(fd, O_WRONLY | O_RDWR))

and then checks for the existence of the "write" flag using fdInUse at line 193

else if ((fileFlags[fd] & flag) == static_cast<unsigned>(flag))

This comparison is only true when both bits are set, i.e. when the flag is 0x3. Therefore, opening the file with only O_WRONLY results in a write failure.
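The behaviour of that comparison can be reproduced without Ripes. In the sketch below, the constants are copied from the excerpt above but renamed (kWrOnly, kRdWr) to avoid clashing with the <fcntl.h> macros; allBitsSet mirrors the quoted condition, while anyBitSet is only a hypothetical relaxation shown for contrast, not the fix adopted upstream.

```cpp
// Flag values copied from the systemio.h excerpt above (renamed on purpose).
constexpr int kWrOnly = 0x1;
constexpr int kRdWr = 0x2;
constexpr int kWriteMask = kWrOnly | kRdWr; // == 3

// Mirrors the quoted check: every bit of `mask` must be present in `fileFlags`.
constexpr bool allBitsSet(int fileFlags, int mask) {
  return (fileFlags & mask) == mask;
}

// Hypothetical alternative: a single matching bit is enough.
constexpr bool anyBitSet(int fileFlags, int mask) {
  return (fileFlags & mask) != 0;
}

static_assert(!allBitsSet(kWrOnly, kWriteMask)); // O_WRONLY alone is rejected
static_assert(!allBitsSet(kRdWr, kWriteMask));   // O_RDWR alone is rejected
static_assert(allBitsSet(3, kWriteMask));        // only 3 passes
static_assert(anyBitSet(kWrOnly, kWriteMask));
static_assert(anyBitSet(kRdWr, kWriteMask));
static_assert(!anyBitSet(0, kWriteMask));        // read-only would still be rejected
```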

Hi @mortbopet
Ideally, writing to a file should be allowed as long as it satisfies either O_WRONLY or O_RDWR. However, here it is specified that both conditions must be met for a write operation to proceed. I would like to inquire whether the author has any specific reason for this additional requirement.

Add Replacement Mechanism

Motivation

Ripes #334

Hi @mortbopet,
I think that contributing to the cache replacement policy is a good place to start. The replacement policies currently available are Random and LRU. Intuitively, FIFO can be added, and I have done so with a few lines of code. What I am doing now is trying to add a policy like least frequently used (LFU). However, I cannot find the code that adds the extra slots, as is done for LRU. For example, with a 2-way cache, when you select Repl. policy LRU, the LRU slots show up.

I guess that it might be a UI issue, but I do not know how to modify it. Could you help me?

Pull Request

Link

About Issue #334.

I think that we could add a FIFO mechanism to the cache policy.

A new replacement policy, FIFO (First In, First Out), has been added to the existing enumeration. Additionally, a new counter has been introduced in the cachesim.h file. This counter plays a crucial role in cyclically determining which cache entry should be evicted.

// line 19 in cachesim.h
enum ReplPolicy { Random, LRU, FIFO};

The code below handles the eviction process based on the FIFO replacement policy.

// line 124 in cachesim.cpp
else if (m_replPolicy == ReplPolicy::FIFO) {
    ew.first = CacheSim::counter;
    ew.second = &cacheLine[ew.first];
    CacheSim::counter += 1;
    CacheSim::counter %= getWays();
}
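The cyclic counter can be illustrated on its own. FifoCounterSketch below is a hypothetical stand-in for the counter in the snippet above, not the CacheSim class, and it models a single cache set only.

```cpp
#include <cassert>

// Hypothetical stand-in for the eviction counter (models one cache set only).
class FifoCounterSketch {
public:
  explicit FifoCounterSketch(unsigned ways) : m_ways(ways) { assert(ways > 0); }

  // Returns the way to evict, then advances to the next way with wrap-around.
  unsigned nextVictim() {
    const unsigned victim = m_next;
    m_next = (m_next + 1) % m_ways;
    return victim;
  }

private:
  unsigned m_ways;
  unsigned m_next = 0;
};
```

With two ways, successive calls to nextVictim() return 0, 1, 0, 1, and so on, which is the first-in, first-out order as long as the ways are filled in index order.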

Author Reply

Thank you for looking into adding a new cache replacement policy!

A few comments:

- As per the coding style currently used in Ripes, member variables should be prefixed with m_ and not with the class name.
- I think the counter should be named better, e.g. fifoIndexCounter.
- (optional) do you think there is an accompanying visualization to this? i.e., for LRU, we also show the LRU bits. Could one imagine a similar column which indicates what way in the cache line is currently up for eviction as per. FIFO?

Work on Author's suggestion

Thanks for the reply. I have renamed the counter and tried my best to follow the coding style. If there are still any problems, please let me know.

As for the third comment, I reworked the whole implementation to visualize the FIFO bits. Now there is no need for fifoIndexCounter. Instead, a boolean fifoflag is introduced. This flag is set under two circumstances:

- When an invalid entry is selected, we set the fifoflag.
- When all the entries are full and a cache miss occurs, we need to choose an entry to evict, and we set the fifoflag.

When this flag is set, we add 1 to the fifo bits of every valid entry. In this way, when the fifo bits of an entry equal the number of ways of the cache, that entry should be evicted.

//Line 167 in cachesim.cpp
if (it != cacheLine.end()) {
  ew.first = it->first;
  ew.second = &it->second;
  m_fifoflag = true;
}
if (ew.second == nullptr) {
  for (auto &way : cacheLine) {
    if (static_cast<long>(way.second.fifo) == getWays()){
      ew.first = way.first;
      ew.second = &way.second;
      m_fifoflag = true;
      break;
    }
  }
}
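The ageing scheme described above can be modelled for a single cache set as follows. This is a simplified sketch under stated assumptions: WaySketch and FifoSetSketch are hypothetical names, the new entry starts at age 0 and is aged together with the others, and the undo logic is left out. It is not the code from the pull request.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one way; `fifo` is the age (1 = newest).
struct WaySketch {
  bool valid = false;
  unsigned tag = 0;
  std::size_t fifo = 0;
};

class FifoSetSketch {
public:
  explicit FifoSetSketch(std::size_t ways) : m_ways(ways) {}

  bool contains(unsigned tag) const {
    for (const WaySketch &w : m_ways)
      if (w.valid && w.tag == tag)
        return true;
    return false;
  }

  // Returns true on a hit. Hits do not change any FIFO age.
  bool access(unsigned tag) {
    if (contains(tag))
      return true;
    WaySketch *victim = nullptr;
    for (WaySketch &w : m_ways) // prefer an invalid way
      if (!w.valid) { victim = &w; break; }
    if (victim == nullptr) {
      for (WaySketch &w : m_ways) // otherwise the way whose age equals the number of ways
        if (w.fifo == m_ways.size()) { victim = &w; break; }
    }
    assert(victim != nullptr && "a full set always has exactly one oldest way");
    victim->valid = true;
    victim->tag = tag;
    victim->fifo = 0;
    for (WaySketch &w : m_ways) // age every valid way; the new one becomes 1
      if (w.valid)
        ++w.fifo;
    return false;
  }

private:
  std::vector<WaySketch> m_ways;
};
```

In a 2-way set, accessing tags 0, 1, 0, 2 evicts tag 0 on the last access even though it was just hit, which is what distinguishes FIFO from LRU.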

Furthermore, the undo path needs to be taken into consideration as well. If the policy is FIFO, we set the fifoflag again and restore the old way.

//Line 442 in cachesim.cpp
if (!trace.transaction.isHit && getReplacementPolicy() == ReplPolicy::FIFO) {
  m_fifoflag = true;
  way = oldWay;
}

//Line 82 in cachesim.cpp
if (getReplacementPolicy() == ReplPolicy::FIFO) {
  for(auto &set : line){
    if(set.second.valid && m_fifoflag) set.second.fifo--;
  }
  m_fifoflag = false;
  line[wayIdx].fifo = oldWay.fifo;
}

Demo video here,
https://github.com/mortbopet/Ripes/assets/67796326/b67aab30-17ee-4b3c-8fa7-ba56f905a282

The demo video demonstrates the assembly code below:

lw a1 0(x0)
lw a1 512(x0)
lw a1 0(x0)
lw a1 512(x0)
lw a1 512(x0)
lw a1 1024(x0)
lw a1 1024(x0)
lw a1 1024(x0)
lw a1 1024(x0)
lw a1 0(x0)
lw a1 0(x0)
lw a1 0(x0)
lw a1 1536(x0)
lw a1 1536(x0)
lw a1 1536(x0)
addi a0 x0 1024
addi a0 a0 1024
lw a1 0(a0)

Full code is in the branch FIFO of my fork,
https://github.com/tinhanho/Ripes/commit/6772d1abec6a6bff5e0e40374fd5922c67e3a427

Let me know if there are any problems with my reasoning or implementation.

Trace Code

Instruction Definition

The base of every instruction is the class InstructionBase, defined in src/isa/instruction.h.

/** @brief A no-template, abstract class that defines an instruction. */
class InstructionBase {
public:
  InstructionBase(unsigned byteSize) : m_byteSize(byteSize) {}
  virtual ~InstructionBase() = default;
  /// Assembles a line of tokens into an encoded program.
  virtual AssembleRes assemble(const TokenizedSrcLine &tokens) = 0;
  /// Disassembles an encoded program into a tokenized assembly program.
  virtual Result<LineTokens>
  disassemble(const Instr_T instruction, const Reg_T address,
              const ReverseSymbolMap &symbolMap) const = 0;
  ...

  /**
   * @brief size
   * @return size of assembled instruction, in bytes.
   */
  unsigned size() const { return m_byteSize; }
  ...
protected:
  ...
  unsigned m_byteSize;
};

This class is inherited by the Instruction structure:

template <typename InstrImpl>
struct Instruction : public InstructionBase {
  Instruction()
      : InstructionBase(InstrByteSize<InstrImpl>::byteSize),
        m_name(InstrImpl::NAME.data()) {}

  AssembleRes assemble(const TokenizedSrcLine &tokens) override {
    ...
  }
  Result<LineTokens>
  disassemble(const Instr_T instruction, const Reg_T address,
              const ReverseSymbolMap &symbolMap) const override {
    ...
  }
  const QString &name() const override { return m_name; }
  unsigned numOpParts() const override { return InstrImpl::Opcode::numParts(); }

private:
  const QString m_name;
};

This structure implements the assemble and disassemble functions used for assembling/disassembling instructions at runtime.

The actual instruction implementations are defined in the src/isa/rv_[icm]_ext.h files. Take the I-type for example:

I-Type Definitions

The I-type instructions are defined under the namespace TypeI in src/isa/rv_i_ext.h:

namespace TypeI {

enum class Funct3 : unsigned {
  ADDI = 0b000,
  SLTI = 0b010,
  SLTIU = 0b011,
  XORI = 0b100,
  ORI = 0b110,
  ANDI = 0b111,
};
...

/// An I-Type RISC-V instruction
template <typename InstrImpl, OpcodeID opcodeID, Funct3 funct3>
struct Instr : public RV_Instruction<InstrImpl> {
  struct Opcode
      : public OpcodeSet<OpPartOpcode<opcodeID>,
                         OpPartFunct3<static_cast<unsigned>(funct3)>> {};
  struct Fields : public FieldSet<RegRd, RegRs1, ImmCommon12> {};
};
...

template <typename InstrImpl, Funct3 funct3>
using Instr32 = Instr<InstrImpl, OpcodeID::OPIMM, funct3>;
...

struct Addi : public Instr32<Addi, Funct3::ADDI> {
  constexpr static std::string_view NAME = "addi";
};
...

struct Jalr : public RV_Instruction<Jalr> {
  struct Opcode : public OpcodeSet<OpPartOpcode<RVISA::OpcodeID::JALR>,
                                   OpPartFunct3<static_cast<unsigned>(0b000)>> {
  };
  struct Fields : public FieldSet<RegRd, RegRs1, ImmCommon12> {};

  constexpr static std::string_view NAME = "jalr";
};
}

RV_Instruction is a simple wrapper around Instruction:

// src/isa/rvisainfo_common.h
namespace RVISA {
...
template <typename InstrImpl>
struct RV_Instruction : public Instruction<InstrImpl> {
  constexpr static unsigned instrBits() { return INSTR_BITS; } // 32
};
...
}

These instructions share the common format imm[11:0] | rs1 | funct3 | rd | opcode (0b0010011), where the fields rd (RegRd), rs1 (RegRs1) and imm (ImmCommon12) are filled in when parsing the assembly code. So, in the definition of Addi, the most important thing is to define the funct3 (Funct3::ADDI).
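As a worked example of that layout, the sketch below packs the five fields into a 32-bit word. encodeIType is a hypothetical helper written for illustration; Ripes builds the word through its Opcode and Fields templates instead.

```cpp
#include <cstdint>

// Illustrative encoder for imm[11:0] | rs1 | funct3 | rd | opcode (not Ripes code).
constexpr std::uint32_t encodeIType(std::int32_t imm, unsigned rs1,
                                    unsigned funct3, unsigned rd,
                                    unsigned opcode) {
  return ((static_cast<std::uint32_t>(imm) & 0xFFFu) << 20) | // bits 31:20
         ((rs1 & 0x1Fu) << 15) |                              // bits 19:15
         ((funct3 & 0x7u) << 12) |                            // bits 14:12
         ((rd & 0x1Fu) << 7) |                                // bits 11:7
         (opcode & 0x7Fu);                                    // bits 6:0
}

constexpr unsigned kOpImm = 0b0010011;  // opcode shared by addi, slti, ...
constexpr unsigned kFunct3Addi = 0b000; // matches Funct3::ADDI above

// addi x1, x1, 1
static_assert(encodeIType(1, 1, kFunct3Addi, 1, kOpImm) == 0x00108093u);
// addi x1, x0, -1 (the immediate is stored as 12-bit two's complement)
static_assert(encodeIType(-1, 0, kFunct3Addi, 1, kOpImm) == 0xFFF00093u);
```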

The JALR instruction:

Note that the JALR instruction is an exception: its opcode is 0b1100111, so it inherits from RV_Instruction directly.

U-Type Definitions

namespace TypeU {

constexpr static unsigned VALID_INDEX = 1;
template <unsigned index, SymbolType symbolType>
struct ImmU : public ImmSym<index, 32, Repr::Hex, ImmPart<0, 12, 31>, symbolType> {
  static_assert(index == VALID_INDEX, "Invalid token index");
};

/// A U-Type RISC-V instruction
template <typename InstrImpl, RVISA::OpcodeID opcodeID, SymbolType symbolType = SymbolType::None>
class Instr : public RV_Instruction<InstrImpl> {
  template <unsigned index>
  using Imm = ImmU<index, symbolType>;

public:
  struct Opcode : public OpcodeSet<OpPartOpcode<opcodeID>> {};
  struct Fields : public FieldSet<RegRd, Imm> {};
};

struct Auipc
    : public Instr<Auipc, RVISA::OpcodeID::AUIPC, SymbolType::Absolute> {
  constexpr static std::string_view NAME = "auipc";
};

struct Lui : public Instr<Lui, RVISA::OpcodeID::LUI> {
  constexpr static std::string_view NAME = "lui";
};
} // namespace TypeU

Instruction Initialization

At runtime, the instructions are initialized by the enableInstructions function, defined in src/isa/instruction.h, which adds them to the InstrVec m_instructions of the RV_ISAInfoBase class:

// src/isa/rvisainfo_common.h
class RV_ISAInfoBase : public ISAInfoBase {
  ...
  const InstrVec &instructions() const override { return m_instructions; }
  ...
  void initialize(const std::set<Option> &options = {}) {
    RVISA::ExtI::enableExt(this, m_instructions, m_pseudoInstructions, options);
    ...
  }
  ...
}
// src/isa/rv_i_ext.cpp
namespace ExtI {
...
void enableExt(const ISAInfoBase *isa, InstrVec &instructions,
               PseudoInstrVec &pseudoInstructions,
               const std::set<Option> &options) {
  ...
  enableInstructions<Addi, Andi, Slti, Sltiu, Xori, Ori, Lb, Lh, Lw, Lbu, Lhu,
                     Ecall, Auipc, Lui, Jal, Jalr, Sb, Sw, Sh, Add, Sub, Sll,
                     Slt, Sltu, Xor, Srl, Sra, Or, And, Beq, Bne, Blt, Bge,
                     Bltu, Bgeu>(instructions);
  ...
}

Later, the function setInstructions initializes the map InstrMap m_instructionMap from all the previously registered instructions:

class Assembler : public AssemblerBase {
  ...
  void setInstructions() {
    if (m_instructionMap.size() != 0) {
      throw std::runtime_error("Instructions already set");
    }
    for (const auto &iter : m_isa->instructions()) {
      const auto instr_name = iter.get()->name();
      if (m_instructionMap.count(instr_name) != 0) {
        throw std::runtime_error("Error: instruction with opcode '" +
                                  instr_name.toStdString() +
                                  "' has already been registerred.");
      }
      m_instructionMap[instr_name] = iter;
    }
  }
  ...
}

Instruction Assembling

Then, the function Assembler::assemble, defined in src/assembler/assembler.h, is the entry point for assembling the instructions at runtime:

#define runPass(resName, resType, passFunction, ...)                           \
  auto passFunction##_res = passFunction(__VA_ARGS__);                         \
  if (auto *errors = std::get_if<Errors>(&passFunction##_res)) {               \
    result.errors.insert(result.errors.end(), errors->begin(), errors->end()); \
    assert(result.errors.size() != 0);                                         \
    return result;                                                             \
  }                                                                            \
  auto resName = std::get<resType>(passFunction##_res);

...

class Assembler : public AssemblerBase {
...
assemble(const QStringList &programLines, const SymbolMap *symbols = nullptr,
         const QString &sourceHash = QString()) const override {
  AssembleResult result;
  ... // tokenize and expand pseudo instructions
  /** Assemble. During assembly, we generate:
   * - linkageMap: Recording offsets of instructions which require linkage
   * with symbols
   */
  LinkRequests needsLinkage;
  runPass(program, Program, pass2, expandedLines, needsLinkage);
  // Symbol linkage
  runPass(unused, NoPassResult, pass3, program, needsLinkage);
  Q_UNUSED(unused);
  result.program = program;
  result.program.sourceHash = sourceHash;
  result.program.entryPoint = m_sectionBasePointers.at(".text");
  return result;
}
...
}
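The runPass macro boils down to a "variant of errors or result" pattern with an early return. A minimal sketch of that pattern without the macro is shown below; doublePass, PipelineResult and runPipeline are hypothetical names chosen for illustration and do not exist in Ripes.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

using Errors = std::vector<std::string>;

// Hypothetical pass: fails on negative input, otherwise doubles it.
inline std::variant<Errors, int> doublePass(int input) {
  if (input < 0)
    return Errors{"negative input"};
  return input * 2;
}

struct PipelineResult {
  Errors errors;
  int value = 0;
};

// Runs two passes in sequence and stops at the first one that reports errors.
inline PipelineResult runPipeline(int input) {
  PipelineResult result;
  auto first = doublePass(input);
  if (auto *errors = std::get_if<Errors>(&first)) {
    result.errors.insert(result.errors.end(), errors->begin(), errors->end());
    return result;
  }
  auto second = doublePass(std::get<int>(first));
  if (auto *errors = std::get_if<Errors>(&second)) {
    result.errors.insert(result.errors.end(), errors->begin(), errors->end());
    return result;
  }
  result.value = std::get<int>(second);
  return result;
}

// Usage: a valid input flows through both passes, an invalid one stops early.
inline void demoPipeline() {
  assert(runPipeline(3).value == 12);
  assert(runPipeline(-1).errors.size() == 1);
}
```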

Within this function, pass2 translates the instructions:

#define runOperation(resName, operationFunction, ...)                          \
  auto operationFunction##_res = operationFunction(__VA_ARGS__);               \
  if (operationFunction##_res.isError()) {                                     \
    errors.push_back(operationFunction##_res.error());                         \
    continue;                                                                  \
  }                                                                            \
  auto resName = operationFunction##_res.value();

...
class Assembler : public AssemblerBase {
std::variant<Errors, Program> pass2(const SourceProgram &tokenizedLines,
                                    LinkRequests &needsLinkage) const {
  ...// Initialize program with initialized segments:
  for (const auto &line : tokenizedLines) {
    ...// adjust the symbol addr based on the section base address
    ...// handle directive
    addr_offset = currentSection->data.size();
    if (!wasDirective) {
      /// Maintain a pointer to the instruction that was assembled.
      std::shared_ptr<InstructionBase> assembledWith;
      runOperation(machineCode, assembleInstruction, line, assembledWith);
      assert(assembledWith && "Expected the assembler instruction to be set");
      program.sourceMapping[addr_offset].insert(line.sourceLine());

      if (!machineCode.linksWithSymbol.symbol.isEmpty()) {
        LinkRequest req(line.sourceLine());
        req.offset = addr_offset;
        req.fieldRequest = machineCode.linksWithSymbol;
        req.section = m_currentSection;
        req.instrAlignment = m_isa->instrByteAlignment();
        needsLinkage.push_back(req);
      }
      ...// handle misalignment
      currentSection->data.append(
          QByteArray(reinterpret_cast<char *>(&machineCode.instruction),
                      assembledWith->size()));
    }
    // This was a directive; append any assembled bytes to the segment.
    currentSection->data.append(directiveBytes);
  }
  if (errors.size() != 0) {
    return {errors};
  }
  ...
  return {program};
}
}

The function assembleInstruction assembles and checks a single instruction. If any error is found, the error message is pushed to the error list and highlighted in the editor:

virtual AssembleRes
assembleInstruction(const TokenizedSrcLine &line,
                    std::shared_ptr<InstructionBase> &assembledWith) const {
  if (line.tokens.empty()) {
    return {
        Error(line, "Empty source lines should be impossible at this point")};
  }
  const auto &opcode = line.tokens.at(0);
  auto instrIt = m_instructionMap.find(opcode);
  if (instrIt == m_instructionMap.end()) {
    return {Error(line, "Unknown opcode '" + opcode + "'")};
  }
  assembledWith = instrIt->second;
  return assembledWith->assemble(line);
}

It first checks whether the opcode (instruction name) is known. If so, it retrieves the instruction implementation from Assembler::m_instructionMap and calls the assemble function implemented by that instruction.
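The register-then-look-up flow can be sketched with standard containers only. InstrSketch, NopSketch and RegistrySketch are hypothetical stand-ins for InstructionBase, a concrete instruction and the assembler's name map; the real types carry tokens, symbols and error objects that are omitted here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

struct InstrSketch { // hypothetical stand-in for InstructionBase
  virtual ~InstrSketch() = default;
  virtual std::uint32_t assemble() const = 0;
};

struct NopSketch : InstrSketch {
  std::uint32_t assemble() const override { return 0x00000013u; } // addi x0, x0, 0
};

class RegistrySketch {
public:
  // Registers an instruction by name; duplicates are rejected.
  void add(const std::string &name, std::shared_ptr<InstrSketch> instr) {
    if (!m_map.emplace(name, std::move(instr)).second)
      throw std::runtime_error("instruction '" + name + "' already registered");
  }

  // An empty optional plays the role of the "Unknown opcode" error.
  std::optional<std::uint32_t> assemble(const std::string &name) const {
    const auto it = m_map.find(name);
    if (it == m_map.end())
      return std::nullopt;
    return it->second->assemble(); // virtual dispatch to the implementation
  }

private:
  std::map<std::string, std::shared_ptr<InstrSketch>> m_map;
};

// Usage: a registered name assembles, an unknown one does not.
inline void demoRegistry() {
  RegistrySketch registry;
  registry.add("nop", std::make_shared<NopSketch>());
  assert(registry.assemble("nop").value() == 0x00000013u);
  assert(!registry.assemble("bogus").has_value());
}
```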

As explained above, the instruction implementations inherit from the Instruction structure, so if an instruction does not override the assemble function, the default implementation defined by the Instruction structure is used:

template <typename InstrImpl>
struct Instruction : public InstructionBase {
  ...
  AssembleRes assemble(const TokenizedSrcLine &tokens) override {
      Instr_T instruction = 0;
      FieldLinkRequest linksWithSymbol;
      InstrImpl::Opcode::apply(instruction, linksWithSymbol);
      if (auto fieldRes = InstrImpl::Fields::apply(tokens, instruction, linksWithSymbol); fieldRes.isError()) {
        return std::get<Error>(fieldRes);
      }
      InstrRes res;
      res.linksWithSymbol = linksWithSymbol;
      res.instruction = instruction;
      return res;
  }
  ...
}

When this implementation is called, and the apply function is not overridden by the instruction implementation, it first uses the one provided by struct OpPartBase, defined in src/isa/instruction.h, to combine the fields into a complete instruction:

struct OpPartBase {
  ...
  const BitRangeBase range;
  ...
  constexpr void apply(Instr_T &instruction) const {
    instruction |= range.apply(value);
  }
  ...
}
...
struct BitRangeBase {
  ...
  constexpr unsigned width() const { return stop - start + 1; }
  constexpr Instr_T getMask() const { return vsrtl::generateBitmask(width()); }
  constexpr Instr_T apply(Instr_T value) const { return (value & getMask()) << start; }
  ...
}
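The mask-and-shift step is easy to try on its own. BitRangeSketch below re-creates it with a local mask helper instead of vsrtl::generateBitmask, so it is an approximation for illustration and not the Ripes type. It also shows why the width check from Issue #258 matters: apply silently drops bits that do not fit.

```cpp
#include <cstdint>

// Illustrative re-creation of the bit range logic (not the Ripes types).
struct BitRangeSketch {
  unsigned start; // lowest bit index of the field
  unsigned stop;  // highest bit index of the field (inclusive)

  constexpr unsigned width() const { return stop - start + 1; }
  constexpr std::uint32_t mask() const {
    return width() >= 32 ? 0xFFFFFFFFu : ((std::uint32_t{1} << width()) - 1u);
  }
  // Keep the low `width()` bits of value and move them to the field position.
  constexpr std::uint32_t apply(std::uint32_t value) const {
    return (value & mask()) << start;
  }
};

constexpr BitRangeSketch kOpcode{0, 6};
constexpr BitRangeSketch kRd{7, 11};
constexpr BitRangeSketch kImmU{12, 31};

// lui x1, 0x12345 (the LUI opcode is 0b0110111)
constexpr std::uint32_t kLui =
    kImmU.apply(0x12345) | kRd.apply(1) | kOpcode.apply(0b0110111);
static_assert(kLui == 0x123450B7u);

// An oversized immediate is truncated to its low 20 bits without any error.
static_assert(kImmU.apply(0x12345678) == 0x45678000u);
```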

SysCall Handling

// src/syscall/ripes_syscall.cpp
bool SyscallManager::execute(SyscallID id) {
  if (m_syscalls.count(id) == 0) {
    postToGUIThread([=] {
      if (auto reg = ProcessorHandler::currentISA()->syscallReg();
          reg.has_value()) {
        ...
      }
    });
    return false;
  } else {
    const auto &syscall = m_syscalls.at(id);
    ...
    syscall->execute();
    ...
    return true;
  }
}

When syscall->execute() is executed, the corresponding syscall implementation will be called. For the open syscall:

// src/syscall/file.h
template <typename BaseSyscall>
class OpenSyscall : public BaseSyscall {
  static_assert(std::is_base_of<Syscall, BaseSyscall>::value);

public:
  OpenSyscall()
      : BaseSyscall("Open", "Opens a file from a path",
                    {{0, "Pointer to null terminated string for the path"},
                     {1, "flags"}},
                    {{0, "the file decriptor or -1 if an error occurred"}}) {}
  void execute() {
    const AInt arg0 = BaseSyscall::getArg(BaseSyscall::REG_FILE, 0);
    const AInt arg1 = BaseSyscall::getArg(BaseSyscall::REG_FILE, 1);
    QByteArray string;
    char byte;
    unsigned int address = arg0;
    do {
      byte = static_cast<char>(
          ProcessorHandler::getMemory().readMemConst(address++, 1) & 0xFF);
      string.append(byte);
    } while (byte != '\0');

    int ret = SystemIO::openFile(QString::fromUtf8(string), arg1);

    BaseSyscall::setRet(BaseSyscall::REG_FILE, 0, ret);
  }
};

It then calls SystemIO::openFile(...):

// src/syscall/systemio.h
static int openFile(QString filename, int flags) {
    SystemIO::get(); // Ensure that SystemIO is constructed
    // Internally, a "file descriptor" is an index into a table
    // of the filename, flag, and the File???putStream associated with
    // that file descriptor.

    int retValue = -1;
    int fdToUse;

    // Check internal plausibility of opening this file
    fdToUse = FileIOData::nowOpening(filename, flags);
    retValue = fdToUse; // return value is the fd
    if (fdToUse < 0) {
      return -1;
    } // fileErrorString would have been set

    try {
      FileIOData::openFilestream(fdToUse, filename);
    } catch (int) {
      s_fileErrorString = "File " + filename + " could not be opened.";
      retValue = -1;
    }

    return retValue; // return the "file descriptor"
}

This function first allocates an fd for the specified file using FileIOData::nowOpening(), which saves the flags parameter into a map called fileFlags[i]:

static int nowOpening(const QString &filename, int flag) {
  ...
  fileNames[i] = filename; // our table has its own copy of filename
  fileFlags[i] = flag;
  ...
  return i;
}

Then, it tries to open the file with FileIOData::openFilestream():

// src/syscall/systemio.h
static constexpr int O_RDONLY = 0x00000000;
static constexpr int O_WRONLY = 0x00000001;
static constexpr int O_RDWR = 0x00000002;
static constexpr int O_APPEND = 0x00000008;
static constexpr int O_CREAT = 0x00000200; // 512
static constexpr int O_TRUNC = 0x00000400; // 1024
static constexpr int O_EXCL = 0x00000800;  // 2048

...

static void openFilestream(int fd, const QString &filename) {
  files.emplace(fd, filename);

  const auto flags = fileFlags[fd];
  const auto qtOpenFlags = // Translate from stdlib file flags to Qt flags
      (flags & O_RDONLY ? QIODevice::ReadOnly : QIODevice::NotOpen) |
      (flags & O_WRONLY ? QIODevice::WriteOnly : QIODevice::NotOpen) |
      (flags & O_RDWR ? QIODevice::ReadWrite : QIODevice::NotOpen) |
      (flags & O_TRUNC ? QIODevice::Truncate : QIODevice::Append) |
      (flags & O_EXCL ? QIODevice::NewOnly : QIODevice::NotOpen);

  // Try to open file with the given flags
  files[fd].open(qtOpenFlags);

  ...
}

This function shows that the open syscall relies on QFile::open() provided by Qt. The given flags are converted to the flags defined by Qt:

Constant	Value
QIODeviceBase::NotOpen	0x0000
QIODeviceBase::ReadOnly	0x0001
QIODeviceBase::WriteOnly	0x0002
QIODeviceBase::ReadWrite	ReadOnly | WriteOnly
QIODeviceBase::Append	0x0004
QIODeviceBase::Truncate	0x0008
QIODeviceBase::Text	0x0010
QIODeviceBase::Unbuffered	0x0020
QIODeviceBase::NewOnly	0x0040
QIODeviceBase::ExistingOnly	0x0080
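
To see what the translation produces for concrete inputs, the expression in openFilestream() can be replayed without Qt. The sketch below is an assumption-laden stand-in: the QT_* constants are plain unsigned values copied from the table above instead of the real Qt flag type, the RIPES_ prefix is added only to avoid clashing with any <fcntl.h> macros, and translateFlags is a hypothetical name.

```cpp
// Ripes-side flag values, copied from src/syscall/systemio.h.
constexpr int RIPES_O_RDONLY = 0x00000000;
constexpr int RIPES_O_WRONLY = 0x00000001;
constexpr int RIPES_O_RDWR = 0x00000002;
constexpr int RIPES_O_TRUNC = 0x00000400;
constexpr int RIPES_O_EXCL = 0x00000800;

// Stand-ins for the QIODevice values listed in the table (assumption: plain
// unsigned constants, not the real Qt type).
constexpr unsigned QT_NOT_OPEN = 0x0000;
constexpr unsigned QT_READ_ONLY = 0x0001;
constexpr unsigned QT_WRITE_ONLY = 0x0002;
constexpr unsigned QT_READ_WRITE = QT_READ_ONLY | QT_WRITE_ONLY;
constexpr unsigned QT_APPEND = 0x0004;
constexpr unsigned QT_TRUNCATE = 0x0008;
constexpr unsigned QT_NEW_ONLY = 0x0040;

// Same expression shape as the qtOpenFlags computation in openFilestream().
constexpr unsigned translateFlags(int flags) {
  return ((flags & RIPES_O_RDONLY) ? QT_READ_ONLY : QT_NOT_OPEN) |
         ((flags & RIPES_O_WRONLY) ? QT_WRITE_ONLY : QT_NOT_OPEN) |
         ((flags & RIPES_O_RDWR) ? QT_READ_WRITE : QT_NOT_OPEN) |
         ((flags & RIPES_O_TRUNC) ? QT_TRUNCATE : QT_APPEND) |
         ((flags & RIPES_O_EXCL) ? QT_NEW_ONLY : QT_NOT_OPEN);
}

// O_RDONLY is 0, so the first test can never be true; only Append remains.
static_assert(translateFlags(RIPES_O_RDONLY) == QT_APPEND, "");
static_assert(translateFlags(RIPES_O_WRONLY) == (QT_WRITE_ONLY | QT_APPEND),
              "");
static_assert(translateFlags(RIPES_O_WRONLY | RIPES_O_TRUNC) ==
                  (QT_WRITE_ONLY | QT_TRUNCATE),
              "");
```

Under these assumptions, two properties of the expression stand out: a flags value of O_RDONLY never selects ReadOnly, and Append is selected whenever O_TRUNC is absent.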

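Note that the Ripes-defined flags listed above are NOT identical to the flags defined by Linux in its generic fcntl.h header. Only the access modes (O_RDONLY, O_WRONLY, O_RDWR) match; the Linux values below are written in octal: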

#define O_ACCMODE	00000003
#define O_RDONLY	00000000
#define O_WRONLY	00000001
#define O_RDWR		00000002
#ifndef O_CREAT
#define O_CREAT	00000100	/* not fcntl */
#endif
#ifndef O_EXCL
#define O_EXCL		00000200	/* not fcntl */
#endif
#ifndef O_NOCTTY
#define O_NOCTTY	00000400	/* not fcntl */
#endif
#ifndef O_TRUNC
#define O_TRUNC	00001000	/* not fcntl */
#endif
#ifndef O_APPEND
#define O_APPEND	00002000
#endif
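
Because the Linux values are octal and the Ripes values are hexadecimal, the mismatch is easy to miss. The sketch below puts both sets side by side and shows one possible remapping. It is only an illustration: the LINUX_ and RIPES_ prefixes and the function linuxToRipesFlags are hypothetical and do not exist in Ripes.

```cpp
// Linux values, copied from the octal listing above.
constexpr int LINUX_O_ACCMODE = 03;
constexpr int LINUX_O_CREAT = 0100;    // 0x040
constexpr int LINUX_O_EXCL = 0200;     // 0x080
constexpr int LINUX_O_TRUNC = 01000;   // 0x200
constexpr int LINUX_O_APPEND = 02000;  // 0x400

// Ripes values, copied from src/syscall/systemio.h.
constexpr int RIPES_O_APPEND = 0x008;
constexpr int RIPES_O_CREAT = 0x200;
constexpr int RIPES_O_TRUNC = 0x400;
constexpr int RIPES_O_EXCL = 0x800;

// Two of the collisions: the same bit means different things on each side.
static_assert(LINUX_O_TRUNC == RIPES_O_CREAT, "");
static_assert(LINUX_O_APPEND == RIPES_O_TRUNC, "");

// Hypothetical remapping from Linux flag bits to Ripes flag bits.
// The access mode (low two bits) is identical and is copied through;
// O_NOCTTY has no Ripes counterpart in the listing and is dropped.
constexpr int linuxToRipesFlags(int linuxFlags) {
  int ripesFlags = linuxFlags & LINUX_O_ACCMODE;
  if (linuxFlags & LINUX_O_CREAT)
    ripesFlags |= RIPES_O_CREAT;
  if (linuxFlags & LINUX_O_EXCL)
    ripesFlags |= RIPES_O_EXCL;
  if (linuxFlags & LINUX_O_TRUNC)
    ripesFlags |= RIPES_O_TRUNC;
  if (linuxFlags & LINUX_O_APPEND)
    ripesFlags |= RIPES_O_APPEND;
  return ripesFlags;
}
```

For example, the Linux value for O_WRONLY | O_CREAT | O_TRUNC is 01101 (0x241). Passed to Ripes unchanged, bit 0x200 would be read as O_CREAT and the truncate request would be lost, while the remapping above yields 0x601.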

Cache Handling

In src/cachesim/cachesim.h, there is an enumeration ReplPolicy that lists the available cache replacement policies:

enum ReplPolicy { Random, LRU };

Select Victim

Also in src/cachesim/cachesim.h, the function CacheSim::access is the main function for accessing the cache. On a cache miss (for a read, or for a write under the write-allocate policy), it calls CacheSim::evictAndUpdate to select the victim:

void CacheSim::access(AInt address, MemoryAccess::Type type) {
    ...
    if (!transaction.isHit) {
        if (type == MemoryAccess::Read 
        || (type == MemoryAccess::Write && getWriteAllocPolicy() == WriteAllocPolicy::WriteAllocate)) {
            oldWay = evictAndUpdate(transaction);
        }
    } else {
        oldWay = m_cacheLines[transaction.index.line][transaction.index.way];
    }
    ...
}
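
The miss condition in CacheSim::access can be isolated into a small predicate. This is a sketch with stand-in types: AccessType replaces MemoryAccess::Type, allocatesOnMiss is a hypothetical name, and NoWriteAllocate is an assumed name for the second policy value, which is not shown in the excerpt above.

```cpp
// Stand-in types for this sketch only.
enum class AccessType { Read, Write };
enum class WriteAllocPolicy { WriteAllocate, NoWriteAllocate };

// Returns true when a miss should evict a way and bring in a new line:
// always for reads, and for writes only under write-allocate.
constexpr bool allocatesOnMiss(AccessType type, WriteAllocPolicy policy) {
  return type == AccessType::Read ||
         (type == AccessType::Write &&
          policy == WriteAllocPolicy::WriteAllocate);
}

static_assert(allocatesOnMiss(AccessType::Read,
                              WriteAllocPolicy::NoWriteAllocate), "");
static_assert(!allocatesOnMiss(AccessType::Write,
                               WriteAllocPolicy::NoWriteAllocate), "");
```

In other words, a write miss without write-allocate leaves the cache contents untouched, so no victim is selected for it.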

The function CacheSim::evictAndUpdate first determines which way the victim should be selected from by calling CacheSim::locateEvictionWay. Once the victim way is determined, it updates the cache line flags:

CacheSim::CacheWay CacheSim::evictAndUpdate(CacheTransaction &transaction) {
    const auto [wayIdx, wayPtr] = locateEvictionWay(transaction);

    ... // ignored, explained later

    *wayPtr = CacheWay();
    wayPtr->valid = true;
    wayPtr->dirty = false;
    wayPtr->tag = getTag(transaction.address);
    ...
    return eviction;
}

However, the victim cache line is not updated in place; it is replaced with a new cache line instead. The reasons are:

The victim cache line may be dirty, in which case a writeback is needed
The old value is recorded so that the access can be rolled back

Therefore, the function also records the changes in transaction and returns the evicted cache line:

CacheSim::CacheWay CacheSim::evictAndUpdate(CacheTransaction &transaction) {
    ...
    CacheWay eviction;

    if (!wayPtr->valid) {
        transaction.transToValid = true;
    } else {
        eviction = *wayPtr;
        if (eviction.dirty) {
            transaction.isWriteback = true;
        }
    }
    ...
    transaction.tagChanged = true;
    transaction.index.way = wayIdx;

    return eviction;
}
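
Putting the two excerpts together, the bookkeeping can be sketched as a free function. This is a simplified illustration, not the Ripes implementation: Way and Transaction are stand-in structs that keep only the fields shown above, and the victim slot is passed in directly instead of being looked up with locateEvictionWay.

```cpp
#include <cstdint>

// Stand-in for CacheSim::CacheWay (only the fields used in the excerpts).
struct Way {
  bool valid = false;
  bool dirty = false;
  std::uint32_t tag = 0;
};

// Stand-in for CacheTransaction (only the flags used in the excerpts).
struct Transaction {
  bool transToValid = false;
  bool isWriteback = false;
  bool tagChanged = false;
};

// Replaces `slot` with a fresh valid line tagged `newTag`, records what
// happened in `transaction`, and returns the previous contents of the slot
// (a default-constructed Way if the slot was invalid).
Way evictAndUpdate(Way &slot, std::uint32_t newTag, Transaction &transaction) {
  Way eviction;
  if (!slot.valid) {
    transaction.transToValid = true; // slot goes from invalid to valid
  } else {
    eviction = slot;                 // keep the old line for the caller
    if (eviction.dirty)
      transaction.isWriteback = true; // dirty victim must be written back
  }
  slot = Way();
  slot.valid = true;
  slot.dirty = false;
  slot.tag = newTag;
  transaction.tagChanged = true;
  return eviction;
}
```

Returning the old line by value is what makes a later rollback possible: the caller holds a copy of the state that was overwritten.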

Select Victim Way

As mentioned before, the function CacheSim::locateEvictionWay is used for selecting the victim way.

This function first checks the replacement policy. It selects a way at random if ReplPolicy::Random is in use:

std::pair<unsigned, CacheSim::CacheWay *>
CacheSim::locateEvictionWay(const CacheTransaction &transaction) {
    ...
    std::pair<unsigned, CacheSim::CacheWay *> ew;
    ...
    if (m_replPolicy == ReplPolicy::Random) {
        ew.first = std::rand() % getWays();
        ew.second = &cacheLine[ew.first];
    }
    ...
    return ew;
}

Otherwise, with ReplPolicy::LRU, it looks for the first invalid way in the cache line. If all ways are valid, it selects the least recently used one, i.e. the way whose lru counter equals getWays() - 1:

std::pair<unsigned, CacheSim::CacheWay *>
CacheSim::locateEvictionWay(const CacheTransaction &transaction) {
    ...
    else if (m_replPolicy == ReplPolicy::LRU) {
        ...
        auto it = std::find_if(cacheLine.begin(), cacheLine.end(), [=](const auto &way) { return !way.second.valid; });
        if (it != cacheLine.end()) {
            ew.first = it->first;
            ew.second = &it->second;
        }
        if (ew.second == nullptr) {
            for (auto &way : cacheLine) {
                if (static_cast<long>(way.second.lru) == getWays() - 1) {
                    ew.first = way.first;
                    ew.second = &way.second;
                    break;
                }
            }
        }
    }
    ...
    return ew;
}
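
Both branches can be combined into one self-contained sketch. It rests on several assumptions: Way keeps only the valid and lru fields, the cache line is modelled as a std::map from way index to Way (suggested by the way.first / way.second accesses above), the caller has already created an entry for every way, and the parts elided with "..." in the excerpts are not reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <map>
#include <utility>

enum class ReplPolicy { Random, LRU };

// Stand-in for CacheSim::CacheWay with only the fields needed here.
struct Way {
  bool valid = false;
  unsigned lru = 0; // assumed: larger value = less recently used
};

using CacheLine = std::map<unsigned, Way>;

// Returns the index of the victim way and a pointer to it.
// Assumes `line` holds an entry for every index in [0, ways).
std::pair<unsigned, Way *> locateEvictionWay(CacheLine &line, unsigned ways,
                                             ReplPolicy policy) {
  assert(ways > 0);
  std::pair<unsigned, Way *> ew{0, nullptr};
  if (policy == ReplPolicy::Random) {
    ew.first = static_cast<unsigned>(std::rand()) % ways;
    ew.second = &line[ew.first];
  } else {
    // Prefer the first invalid way, if any.
    auto it = std::find_if(line.begin(), line.end(),
                           [](const auto &way) { return !way.second.valid; });
    if (it != line.end()) {
      ew.first = it->first;
      ew.second = &it->second;
    } else {
      // All ways valid: pick the one with the highest LRU counter.
      for (auto &way : line) {
        if (way.second.lru == ways - 1) {
          ew.first = way.first;
          ew.second = &way.second;
          break;
        }
      }
    }
  }
  return ew;
}
```

How the lru counters are updated on each access is not covered by the excerpts above, so this sketch only reads them.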