# Secure Boot Implementation and AMP Architecture

&nbsp;&nbsp;&nbsp;&nbsp;This document briefly describes the secure boot flow that Hsien-Feng Ko has implemented so far.

Author: Hsien-Feng Ko (r08943169@ntu.edu.tw)

***

## Outline

**I. Basic Concept of Secure Boot**

**II. Boot Flow with OS Kernel Integrity Check**

- 1. FSBL Hooks provided by Xilinx
- 2. Sequence Diagram of Boot Flow

**III. Secure Boot with Asymmetric Multiprocessing**

- 1. Brief Introduction of Asymmetric Multiprocessing
- 2. Implemented Architecture of Secure Boot with AMP

**IV. References**

&nbsp;&nbsp;&nbsp;&nbsp;This document starts with a basic introduction to and definition of secure boot and trusted boot, then introduces the traditional boot flow of embedded systems and the implemented boot flow with a sequence diagram. The later sections introduce the implemented secure boot flow together with the concept of asymmetric multiprocessing. Lastly, for further detail or a more comprehensive understanding of the work completed so far, references are listed in the *References* section.

***

## I. Basic Concept of Secure Boot

&nbsp;&nbsp;&nbsp;&nbsp;*Secure Boot*, sometimes called *Trusted Boot*, is by definition a boot mechanism which ensures that the code (software, firmware, operating system, or bare-metal application) about to execute on a device is trusted.

&nbsp;&nbsp;&nbsp;&nbsp;One of the most common ways to implement a proper secure boot process is to verify the hash value of the target code that the device will later launch. The hash value of the code is calculated before the code is executed and is compared to the stored hash value of a trusted copy of the code. Only if the comparison shows that the code has not been maliciously modified will the device proceed to launch it.
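
As a minimal sketch of the hash comparison described above, the Python snippet below computes the digest of a code image and compares it with a stored reference value. SHA-256, the function names, and the `"launch"` / `"refuse"` return values are assumptions made for illustration only; this document does not specify the hash algorithm, and the check in the implemented flow is performed during boot rather than in Python.

```python
import hashlib
import hmac


def sha256_digest(image: bytes) -> str:
    """Return the SHA-256 digest of the image as a hex string (assumed algorithm)."""
    return hashlib.sha256(image).hexdigest()


def is_trusted(image: bytes, trusted_digest: str) -> bool:
    """Compare the freshly computed digest with the stored trusted digest."""
    # compare_digest takes time independent of where the strings differ,
    # so it does not leak how many leading characters matched.
    return hmac.compare_digest(sha256_digest(image), trusted_digest)


def boot(image: bytes, trusted_digest: str) -> str:
    """Hypothetical boot decision: launch only when the digests match."""
    if is_trusted(image, trusted_digest):
        return "launch"
    return "refuse"
```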
The hash comparison described above is also the verification approach used in the later sections.

***

## II. Boot Flow with OS Kernel Integrity Check

The standard boot flow with an operating system (OS) for embedded systems is depicted below:

<center>

![](https://i.imgur.com/ivNhby7.png)

Fig. 1. Typical boot flow for embedded systems
<br>
<br>
</center>

By checking the integrity of the OS kernel in the *First Stage Boot Loader (FSBL)*, users can implement a simple secure boot flow with the current development environment, as shown in the chart below:

<center>

![](https://i.imgur.com/qZAoDvQ.png)

Fig. 2. Implemented boot flow
</center>

The steps are as follows:

(1) When the development board is powered on, the code in the SoC BootROM finds the FSBL and checks whether the FSBL code would cause a malfunction. If not, the FSBL is executed.

(2) The FSBL can be either the standard template provided by Xilinx or user-defined code. It mainly performs the following functions:

&nbsp;&nbsp;&nbsp;&nbsp;(1) Initialization of peripherals.

&nbsp;&nbsp;&nbsp;&nbsp;(2) Initialization and loading of the FPGA configuration (bitstream or .bit files).

&nbsp;&nbsp;&nbsp;&nbsp;(3) Execution of user-defined code, namely the functions written inside the FSBL hooks.

*Note: the integrity check of the OS kernel is executed in this stage.*

(3) Once the integrity check passes, the FSBL hands off control to the next stage, which in this scenario is the *Second Stage Boot Loader (SSBL)*.

*Note: when booting a bare-metal application (without an operating system), the FSBL hands off directly to the bare-metal application itself.*

(4) The SSBL then brings up the verified OS kernel, and the root filesystem is booted thereafter.

### 1. FSBL Hooks provided by Xilinx

&nbsp;&nbsp;&nbsp;&nbsp;FSBL hooks are predetermined breakpoints inside the standard template of an FSBL project generated using the *Xilinx Software Development Kit (XSDK)*.
They execute user-defined functions at certain points during the boot process (in the FSBL stage, as mentioned above). Specifically, the FSBL project provides four breakpoints where users can insert their own code or logic:

&nbsp;&nbsp;&nbsp;&nbsp;(1) Before PL\* bitstream download.

&nbsp;&nbsp;&nbsp;&nbsp;(2) After PL bitstream download.

&nbsp;&nbsp;&nbsp;&nbsp;(3) Before FSBL hands off to the Second Stage Boot Loader.

&nbsp;&nbsp;&nbsp;&nbsp;(4) During FSBL fallback.

For more detailed information, readers are encouraged to consult the Xilinx Wiki (Confluence) page [1].

\*Programmable Logic (PL): the region of a Xilinx System-on-Chip (SoC) where FPGA logic is configured. Its counterpart, the *Processing System (PS)*, comprises the processor and its software interface on a Xilinx SoC.

### 2. Sequence Diagram of Boot Flow

The sequence diagram of the implemented secure boot flow is depicted below:

<center>

![](https://i.imgur.com/DZFvbBt.png)

Fig. 3. Secure boot sequence diagram
</center>

&nbsp;&nbsp;&nbsp;&nbsp;Two entities are involved in the transaction of the secure boot sequence: the *HOST*, whose OS image is about to be verified, and the *Secure Module*, the implemented hardware module whose task is to check whether the hash value of the OS kernel that is about to run on the HOST matches the hash value stored on the Secure Module. The sequence diagram consists of the following steps:

(1) The HOST first boots into the FSBL and halts right before handing off to the SSBL, at the *hook before handoff* mentioned in the previous section.

(2) The HOST then asks the Secure Module for an integrity check of the OS kernel that its SSBL is about to hand off to.

(3) The Secure Module then calculates the hash value of the OS kernel and compares it to the stored hash value.

(4) [Case 1] If the calculated hash value matches the stored one, the Secure Module sends back the handoff signal, enabling the FSBL to hand off to the SSBL, which later launches the kernel and the root filesystem of the OS.

(5) [Case 2] If the hash values do not match, the OS kernel might have been altered or compromised. In this case, the Secure Module does not send back the handoff signal and instead warns the user about the potential threat.

***

## III. Secure Boot with Asymmetric Multiprocessing

&nbsp;&nbsp;&nbsp;&nbsp;This section covers topics ranging from a basic explanation of asymmetric multiprocessing to the implementation of secure boot under an asymmetric multiprocessing architecture on the Xilinx ZedBoard.

### 1. Brief Introduction of Asymmetric Multiprocessing

&nbsp;&nbsp;&nbsp;&nbsp;An *Asymmetric Multiprocessing* (abbreviated as *AMP* or *ASMP*) system is a multiprocessor computer system where not all of the multiple interconnected central processing units (CPUs) are treated equally [2][3]. AMP is the opposite of symmetric multiprocessing (SMP), where two or more CPUs are connected to a single shared main memory and are controlled by a single operating system (OS) [4].

<center>

![](https://i.imgur.com/BbEYyo4.png)

Fig. 4. Comparison between SMP and AMP systems [5]
</center>

&nbsp;&nbsp;&nbsp;&nbsp;AMP is merely a concept; the architecture of an AMP system is largely determined by the hardware architecture on which it is implemented.

<center>
<img src="https://i.imgur.com/ixttqND.png" width=75%/>

Fig. 5. AMP example 1 [1][6]
<br>
<img src="https://i.imgur.com/ruEtPJ6.png" width=75%/>

Fig. 6. AMP example 2 [6]
</center>

&nbsp;&nbsp;&nbsp;&nbsp;For example, two AMP implementation architectures are shown above.

(1) In example 1, both processors share one physical memory, while one processor is dedicated to handling the I/Os.
This provides better protection against attacks that try to intrude into the system through the I/Os, since every signal entering or leaving the system must be checked by the processor on the right.

(2) In example 2, each processor in the system has its own private physical memory, and the two processors also share one common physical memory. Compared to example 1, this system provides better memory isolation between the processors because each of them has its own private memory, which can be used to temporarily store sensitive data without the other processor having access.

### 2. Implemented Architecture of Secure Boot with AMP

&nbsp;&nbsp;&nbsp;&nbsp;For convenience in implementing the demonstration, the secure boot demonstration for TBox was conducted with the help of the AMP architecture supported by Xilinx development tools (Vivado Design Suite and Xilinx ZedBoard with Zynq-7000 SoC).

<center>

![](https://i.imgur.com/PNPKdb0.png)

Fig. 7. Architecture of implemented ZedBoard AMP projects
</center>

&nbsp;&nbsp;&nbsp;&nbsp;As shown above, the processing system of the Zynq-7000 SoC mainly comprises a dual-core ARM Cortex-A9 processor, which is suitable for developing simple projects with an AMP architecture. Because the Boot ROM code is designated to run on a specific core, that core is brought up first after the development board is powered on and is therefore named *CPU0*. *CPU0* is responsible not only for running the ROM code, which leads to the execution of the FSBL, but also for initializing *CPU1*, that is, waking *CPU1* up from a low-power wait-for-event (WFE) mode. Only after *CPU1* has been successfully woken by *CPU0* can it perform its designated tasks.

&nbsp;&nbsp;&nbsp;&nbsp;As for the DDR memory, there are actually two 256-MB memory chips on board, but the DDR memory controller accesses them as a single 512-MB memory.
Therefore, for better isolation between the two working cores, it is highly recommended when implementing an AMP project on the ZedBoard to partition the memory into (1) private memory for each CPU and (2) shared memory that both CPUs can access.

&nbsp;&nbsp;&nbsp;&nbsp;To help users better understand how AMP projects are implemented on Xilinx products, Xilinx provides several documents to start with, including:

(1) Bare-metal + bare-metal AMP project [7]

(2) Linux + bare-metal AMP project [8]

&nbsp;&nbsp;&nbsp;&nbsp;Readers who are interested in such AMP projects may refer to the references provided.

## References

[1] "Zynq-7000 FSBL", Xilinx Wiki, Accessed on: Aug. 11, 2021. [Online]. Available: https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/439124055/Zynq-7000+FSBL#Zynq-7000FSBL-WhatarethevarioushooksprovidedinFSBLcode%3F

[2] "Asymmetric multiprocessing", Wikipedia, Accessed on: Jul. 17, 2021. [Online]. Available: https://en.wikipedia.org/wiki/Asymmetric_multiprocessing

[3] SHUBHAMSINGH10, "Difference between Asymmetric and Symmetric Multiprocessing", GeeksforGeeks, Nov. 25, 2020. Accessed on: Jul. 17, 2021. [Online]. Available: https://www.geeksforgeeks.org/difference-between-asymmetric-and-symmetric-multiprocessing/

[4] "Symmetric multiprocessing", Wikipedia, Accessed on: Jul. 17, 2021. [Online]. Available: https://en.wikipedia.org/wiki/Symmetric_multiprocessing

[5] Yuan Gu, Xilinx Inc., "為無線應用可編程 SoC選擇作業系統的關鍵考量" [Key considerations in selecting an operating system for programmable SoCs in wireless applications], Electronics Engineering Times, Taiwan, 2015, Accessed on: Jul. 17, 2021. [Online]. Available: http://www.eettaiwan.com/ART_8800709898_617723_AN_d1e403f8.HTM

[6] Peter Wendt, "ASMP vs SMP", The MCA-Enthusiasts, 2003, Accessed on: Jun. 10, 2021. [Online]. Available: http://ohlandl.ipv7.net/CPU/ASMP_SMP.html

[7] John McDougall, "Simple AMP: Bare-Metal System Running on Both Cortex-A9 Processors", Xilinx Inc., XAPP1079 (v1.0.1), Jan. 24, 2014, Accessed on: Jul. 17, 2021. [Online].
Available: https://www.xilinx.com/support/documentation/application_notes/xapp1079-amp-bare-metal-cortex-a9.pdf

[8] John McDougall, "Simple AMP Running Linux and Bare-Metal System on Both Zynq SoC Processors", Xilinx Inc., XAPP1078 (v1.0), Feb. 14, 2013, Accessed on: Jul. 17, 2021. [Online]. Available: https://www.xilinx.com/support/documentation/application_notes/xapp1078-amp-linux-bare-metal.pdf