2022/09
C Q&A
C Language
(2022/9/11) A collection of Q&As related to the C language.
Latest update on 2022/12/27.
A: The article "NUMA Get Current Node/Core" recommends using sched_getcpu() instead of getcpu().
sched_getcpu() is the most stable way to get the CPU ID. Since you were explicitly looking for both the CPU and the node ID, I replied with getcpu(). However, getcpu() has no libc wrapper in glibc versions before 2.29, so there you need to invoke it through syscall(). This is another reason sched_getcpu() is better than getcpu(), along with portability issues.
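As a minimal sketch of the recommended call (not the original getcpu-ex1.c, which is not reproduced here), the helper below wraps sched_getcpu(). It assumes Linux with glibc; the function name current_cpu is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* must come before any include to expose sched_getcpu() */
#endif
#include <sched.h>

/* Returns the number of the CPU the calling thread is running on,
 * or -1 on error (sched_getcpu() then sets errno).
 * current_cpu is an illustrative name, not part of any library. */
int current_cpu(void)
{
    return sched_getcpu();
}
```

A caller could print the result with `printf("cpu=%d\n", current_cpu());`. The value is only a snapshot, since the scheduler may migrate the thread right after the call.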
getcpu-ex1.c
Terminal
The sched_getcpu man page says the following. Let's give it a try; it does not work. It seems getcpu has to be called through syscall.
getcpu-ex2.c
Terminal
Following the article "Linux System Call Tutorial with C", the example below works with getcpu().
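The syscall route can be sketched as follows. This is an illustration under the assumption of a Linux system whose headers define SYS_getcpu, not the original getcpu-ex3.c; the function name cpu_and_node is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE       /* exposes the syscall() declaration in <unistd.h> */
#endif
#include <stddef.h>       /* NULL */
#include <sys/syscall.h>  /* SYS_getcpu */
#include <unistd.h>       /* syscall() */

/* Stores the current CPU and NUMA node in *cpu and *node.
 * Returns 0 on success, -1 on error (errno is set).
 * cpu_and_node is an illustrative name. */
int cpu_and_node(unsigned int *cpu, unsigned int *node)
{
    /* The third argument (tcache) is unused by modern kernels, so pass NULL. */
    return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

Calling the raw system call avoids depending on a libc wrapper, at the cost of being Linux-specific.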
getcpu-ex3.c
Terminal
Further study
This article, Chapter 2. Memory Addressing, is by far the most comprehensive article about x86 memory management I have ever found.
CPU | Mode | Addressing | Capacity-Physical | Capacity-Virtual |
---|---|---|---|---|
8086 | Real mode | CS(16bits<<4):IP(16bits) | 1MB (20bits) | NA |
80286 | Real mode | CS(16bits<<4):IP(16bits) | 1MB (20bits) | NA |
80286 | Protected Virtual Address mode (PVAM) | 32bits pointer = Selector (16bits) + Offset(16bits) => Segment Base Address (24bits) + Offset(16bits) | 16MB | 1GB |
80386 | Real mode | CS(16bits<<4):IP(16bits) | 1MB (20bits) | NA |
80386 | Protected mode | 32(16+16)/48(16+32)-bit pointer | 4GB (32bits) | 64TB (4GB/Segment X 16K Segments) |
80386 | Virtual 8086 mode | same as 8086 | 1MB | NA |
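The arithmetic behind the table can be sketched in C. The function names are hypothetical, and the 20-bit wrap-around models the 8086 only (later CPUs can reach slightly above 1MB in real mode when the A20 line is enabled).

```c
#include <stdint.h>

/* 8086 real mode: physical = (segment << 4) + offset, truncated to the
 * 20 address lines of the 8086. real_mode_address is an illustrative name. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;
}

/* Virtual capacity = number of selectable segments x maximum segment size.
 * 16K segments = 8192 GDT entries + 8192 LDT entries. */
uint64_t virtual_capacity(uint64_t segments, uint64_t max_segment_bytes)
{
    return segments * max_segment_bytes;
}
```

For example, 1234:0010 maps to physical address 0x12350. With 16K segments, 64KB segments give the 1GB of the 80286 (2^14 x 2^16 = 2^30) and 4GB segments give the 64TB of the 80386 (2^14 x 2^32 = 2^46).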
The figure below shows the format of a descriptor for the 80286 through the Pentium II. Each descriptor is 8 bytes long, so the global and local descriptor tables are each at most 64K bytes long (8192 descriptors). Descriptors for the 80286 and for the 80386 through the Pentium II differ slightly, but the 80286 descriptor is upward-compatible because its last 2 bytes are reserved. The price is the 'ugly' layout of the 80386 descriptor, whose base and limit fields are split into pieces to stay backward compatible with the 80286.
Item | CPU | Description | Example | Components | Location |
---|---|---|---|---|---|
Segment Register | 8086 | A 16-bit value. Segmentation logically divides main memory into segments, each with its own base address. The segment value is shifted left by 4 bits and then added to an Offset Register to get the physical address. | CS, DS, SS, ES | 16-bit Segment Registers | CPU Segment Registers |
Segment Selector | 80286 | Still a 16-bit value, but it now indexes a table of up to 8192 (13-bit index) Segment Descriptors, each holding a 24-bit (16MB) base address | CS, DS, SS, ES | 16 bits consisting of 1) a 13-bit index into the Segment Descriptor table; 2) a 1-bit Table Indicator (TI) that selects the Global Descriptor Table (GDT) or the Local Descriptor Table (LDT); 3) a 2-bit Requested Privilege Level (RPL) that specifies the privilege level of the code or data accessing the segment | CPU Segment Registers referring to the Segment Descriptor Table in memory, which is set up with the lgdt instruction |
Segment Descriptor | 80286 | Expanded to 24 bits for Base Address and contains additional information such as the segment size and the privilege level of the segment | 24 bits | | |
Segment Selector | 80386 | Still 16 bits, same as 80286, with 2 more Segment Selector Registers added, FS and GS. The base address in the Segment Descriptor has been further expanded to 32 bits | CS, DS, SS, ES, FS, GS | 16 bits, same as 80286 | Same as 80286 |
Segment Descriptor | 80386 | Base Address further expanded to 32 bits | 32 bits | | |
Offset Register | 8086 | Stores the offset from which the actual address is calculated. | (CS:)IP; (DS:)BX, DI, SI; (SS:)SP, BP; (ES:)BX, DI, SI | 16 bits | CPU Offset Registers |
Offset Register | 80286 | Stores the offset from which the actual address is calculated. | (CS:)IP; (DS:)BX, DI, SI; (SS:)SP, BP; (ES:)BX, DI, SI | 16 bits | CPU Offset Registers |
Offset Register | 80386 | Stores the offset from which the actual address is calculated. | (CS:)EIP; (DS:)EBX, EDI, ESI; (SS:)ESP, EBP; (ES:)EBX, EDI, ESI | 32 bits | CPU Offset Registers |
Segment Selector Format - 80286 and onwards
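The selector layout (13-bit index, 1-bit Table Indicator, 2-bit RPL) can be sketched as a small decoder. The struct and function names are hypothetical and only illustrate the bit positions.

```c
#include <stdint.h>

/* Fields of a 16-bit segment selector (80286 and later).
 * selector_fields and decode_selector are illustrative names. */
typedef struct {
    uint16_t index;  /* bits 15..3: entry number in the descriptor table */
    uint8_t  ti;     /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;    /* bits 1..0: Requested Privilege Level */
} selector_fields;

selector_fields decode_selector(uint16_t selector)
{
    selector_fields f;
    f.index = (uint16_t)(selector >> 3);
    f.ti    = (uint8_t)((selector >> 2) & 0x1);
    f.rpl   = (uint8_t)(selector & 0x3);
    return f;
}
```

For instance, selector 0x0023 decodes to index 4 in the GDT with RPL 3. Because each descriptor is 8 bytes, `index * 8` is the byte offset of the descriptor inside its table.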
Segment Descriptor Format between 80286 (total 6 bytes, 2 bytes are reserved) and 80386 (total 8 bytes)
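The split base and limit fields of the 8-byte 80386 descriptor can be reassembled as in the sketch below. It assumes the descriptor has been loaded into a 64-bit integer in little-endian order (byte 0 in the lowest bits); the names are hypothetical. On the 80286 the top 2 bytes are reserved, so only the low 24 bits of the base and the low 16 bits of the limit are meaningful.

```c
#include <stdint.h>

/* Decoded view of an 8-byte segment descriptor.
 * descriptor_fields and decode_descriptor are illustrative names. */
typedef struct {
    uint32_t base;    /* bytes 2..4 (bits 23..0) plus byte 7 (bits 31..24) */
    uint32_t limit;   /* bytes 0..1 (bits 15..0) plus low nibble of byte 6 */
    uint8_t  access;  /* byte 5: present bit, DPL, type */
    uint8_t  flags;   /* high nibble of byte 6: granularity, size, ... */
} descriptor_fields;

descriptor_fields decode_descriptor(uint64_t raw)
{
    descriptor_fields d;
    d.limit  = (uint32_t)(raw & 0xFFFF)
             | ((uint32_t)((raw >> 48) & 0xF) << 16);
    d.base   = (uint32_t)((raw >> 16) & 0xFFFFFF)
             | ((uint32_t)((raw >> 56) & 0xFF) << 24);
    d.access = (uint8_t)((raw >> 40) & 0xFF);
    d.flags  = (uint8_t)((raw >> 52) & 0xF);
    return d;
}
```

Decoding the commonly used flat code descriptor 0x00CF9A000000FFFF yields base 0, limit 0xFFFFF, access byte 0x9A and flags 0xC.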
Capacity between 80286 and 80386
GDTR / LDTR Base and Limit
References:
See the separate HackMD note x86assemlby for more information on embedded assembly in the C language.
A:
Instruction intrinsics, and inline and embedded assembler are built into the compiler to enable the use of target processor features that cannot normally be accessed directly from C or C++.
Instruction intrinsics
Instruction intrinsics provide a way of easily incorporating target processor features in C and C++ source code without resorting to complex implementations in assembly language. They have the appearance of a function call in C or C++, but are replaced during compilation by assembly language instructions.
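To illustrate the idea of something that looks like a function call but compiles to an instruction, here is a sketch using the GCC/Clang builtin __builtin_clz. This is a stand-in chosen so the example builds on any GCC or Clang target; it is not one of the ARM Compiler intrinsics the text describes. It assumes a 32-bit unsigned int, and leading_zeros is a hypothetical name.

```c
#include <stdint.h>

/* Counts leading zero bits of a 32-bit value.
 * __builtin_clz is typically lowered to a single instruction
 * (for example CLZ on ARM) when the target has one.
 * leading_zeros is an illustrative name. */
unsigned leading_zeros(uint32_t x)
{
    if (x == 0) {
        return 32;  /* __builtin_clz(0) is undefined, so handle it here */
    }
    return (unsigned)__builtin_clz(x);
}
```

No assembly syntax appears in the source, and the compiler remains free to optimize around the call.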
Inline assembler
The inline assembler supports interworking with C and C++. Any register operand can be an arbitrary C or C++ expression. The inline assembler also expands complex instructions and optimizes the assembly language code.
Note
The output object code might not correspond exactly to your input because of compiler optimization.
Embedded assembler
The embedded assembler enables you to use the full ARM assembler instruction set, including assembler directives. Embedded assembly code is assembled separately from the C and C++ code. A compiled object is produced that is then combined with the object from the compilation of the C and C++ source.
The following table summarizes the main differences between instruction intrinsics, inline assembler, and embedded assembler.
Table 3-1 Differences between instruction intrinsics, inline and embedded assembler
Feature | Instruction Intrinsics | Inline assembler | Embedded assembler |
---|---|---|---|
Instruction set | ARM and Thumb. | ARM and Thumb. (a) | ARM and Thumb. |
ARM assembler directives | None supported. | None supported. | All supported. |
C/C++ expressions | Full C/C++ expressions. | Full C/C++ expressions. | Constant expressions only. |
Optimization of assembly code | Full optimization. | Full optimization. | No optimization. |
Inlining | Automatically inlined. | Automatically inlined. | Can be inlined by linker if it is the right size and linker inlining is enabled. |
Register access | Physical registers, including PC, LR and SP. | Virtual registers except PC, LR and SP. | Physical registers, including PC, LR and SP. |
Return instructions | Generated automatically. | Generated automatically. BX, BXJ, and BLX instructions are not supported. | You must add them in your code. |
BKPT instruction | Supported. | Not supported. | Supported. |
(a) The inline assembler supports Thumb instructions in ARMv6T2, ARMv6-M, and ARMv7.
A:
Inline Assembly
One of the most common methods for using assembly code fragments in a C programming project is to use a technique called inline assembly. Inline assembly is invoked in different compilers in different ways. Also, the assembly language syntax used in the inline assembly depends entirely on the assembly engine used by the C compiler. Microsoft C++, for instance, only accepts inline assembly commands in MASM syntax, while GNU GCC only accepts inline assembly in GAS syntax (also known as AT&T syntax).
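A small sketch of GCC inline assembly in AT&T syntax is shown below. It assumes GCC or Clang on x86; on any other compiler or architecture the preprocessor guard selects a plain C fallback so the file still builds. The function name add_asm is hypothetical.

```c
/* Adds two integers, using an x86 ADD instruction when available.
 * add_asm is an illustrative name. */
int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int sum = a;
    /* AT&T operand order is "source, destination": sum = sum + b. */
    __asm__ ("addl %1, %0"
             : "+r" (sum)   /* operand 0: read and written */
             : "r" (b)      /* operand 1: input only */
             : "cc");       /* the flags register is modified */
    return sum;
#else
    return a + b;           /* portable fallback for other targets */
#endif
}
```

The same instruction in MASM syntax would list the destination first, which is why inline assembly is not portable between compilers.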
ARM: see the main page, Embedded Systems/ARM Microprocessors.
Practically everyone using ARM processors uses the standard calling convention. This makes mixed C and ARM assembly programming fairly easy, compared to other processors. The simplest entry and exit sequence for Thumb functions is:
The standard C calling convention for ARM is specified in detail by ARM PLC in "Procedure Call Standard for the ARM Architecture".
The simplest entry and exit sequence for 32-bit ARM functions is very similar to Thumb functions:
The ARM GCC Inline Assembler Cookbook is a good short article to read before writing inline assembly. It starts with a simple example
More than one assembler instruction in a single inline asm statement.
So far, the assembler instructions are much the same as they'd appear in pure assembly language programs. However, registers and constants are specified in a different way, if they refer to C expressions. The general form of an inline assembler statement is
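In GCC the general form is `asm(code : output operand list : input operand list : clobber list);`. The sketch below applies it to a rotate-right on ARM. It is an illustration in the spirit of the cookbook, not a copy of its listing; the ARM path is only compiled for GCC in ARM (not Thumb) state, and a C fallback keeps the example buildable elsewhere. The function name is hypothetical.

```c
#include <stdint.h>

/* Rotates a 32-bit value right by one bit.
 * rotate_right_by_one is an illustrative name. */
uint32_t rotate_right_by_one(uint32_t value)
{
#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
    uint32_t result;
    __asm__ ("mov %[result], %[value], ror #1"
             : [result] "=r" (result)   /* output operand list */
             : [value]  "r"  (value));  /* input operand list */
    /* No clobber list: no extra registers or flags are changed. */
    return result;
#else
    return (value >> 1) | (value << 31);  /* portable fallback */
#endif
}
```

The symbolic names in square brackets tie each placeholder in the instruction string to a C expression, and the constraint strings ("=r", "r") tell the compiler to pick general-purpose registers.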
The cookbook also covers compiler optimization, which may move or even remove your asm statement. The solution is to add the volatile attribute to the asm statement, which instructs the compiler to exclude your assembler code from code optimization. Remember that you have been warned about using the initial example. Here is the revised version:
Reuse your assembler language parts by defining them as macros and putting them into include files. Using such include files may produce compiler warnings if they are used in modules that are compiled in strict ANSI mode. To avoid that, you can write __asm__ instead of asm and __volatile__ instead of volatile. These are equivalent aliases. Here is a macro which will convert a long value from little endian to big endian or vice versa:
Macro definitions will include the same assembler code whenever they are referenced. This may not be acceptable for larger routines. In this case you may define a C stub function. Here is the byte swap procedure again, this time implemented as a C function.
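Both variants, the macro and the C stub function, can be sketched as follows. This is not the cookbook's listing: it uses the single REV instruction (ARMv6 and later, ARM state) instead of a multi-instruction sequence, operates on uint32_t rather than long, and falls back to plain C on other targets. BYTESWAP32 and byte_swap are hypothetical names.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__) && (__ARM_ARCH >= 6)
/* Macro variant: the asm text is pasted in at every use. */
#define BYTESWAP32(val) \
    __asm__ __volatile__ ("rev %0, %0" : "+r" (val))
#else
/* Portable fallback; val must be a uint32_t lvalue. */
#define BYTESWAP32(val) \
    ((val) = ((val) >> 24) | (((val) >> 8) & 0x0000FF00u) | \
             (((val) << 8) & 0x00FF0000u) | ((val) << 24))
#endif

/* Stub function variant: one copy of the code, called wherever needed. */
uint32_t byte_swap(uint32_t value)
{
    BYTESWAP32(value);
    return value;
}
```

With `uint32_t v = 0x11223344u; BYTESWAP32(v);` the variable holds 0x44332211 afterwards. The macro avoids call overhead, while the function keeps the code size constant for larger routines.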
This YouTube video, Lecture 32. Mixing C and Assembly with ARM Cortex-M MCU, provides a clear explanation with some examples.