# [Exp] Highlights in NVMain source code
###### tags: `research-GraphRC`
[TOC]
## Demystify NVMain column
- Comparison: Column in RC-NVM and DRAM
- DRAM column 
- For a x2 DRAM, each column in a subarray contains 2 bits
- For a x4 DRAM, each column in a subarray contains 4 bits
- Both RC-NVM and traditional DRAM operate in a similar manner
- My guess at the configuration settings (proven wrong after the author gave us the source code for RC-NVM)
```c=1
;********************************************************************************
; General memory system configuration
; Number of banks per rank
BANKS 8
; Number of ranks per channel
; for 3D DRAM, rank and channel are interchangeable
RANKS 4
; Number of channels in the system
CHANNELS 2
; Number of rows in one bank
ROWS 8192
; Number of VISIBLE columns in one bank (real column number = COLS * DeviceWidth)
COLS 1024
; whether enable sub-array level parallelism (SALP)
; options:
; No SALP: MATHeight = ROWS
; SALP: number of subarrays = ROWS / MATHeight
MATHeight 1024
...
; address mapping scheme
; options: SA:R:RK:BK:CH:C (SA-Subarray, R-row, C:column, BK:bank, RK:rank, CH:channel)
AddressMappingScheme RK:SA:BK:CH:R:C
```
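The comments in the config describe two derived quantities: the subarray count (`ROWS / MATHeight`) and the real column count (`COLS * DeviceWidth`). A minimal sketch of that arithmetic follows. The struct and function names are made up for illustration, and `deviceWidth` is an assumed parameter because the excerpt above does not show its value.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the config excerpt above (hypothetical struct, not NVMain code).
struct MemoryGeometry
{
    uint64_t banks = 8;         // BANKS
    uint64_t ranks = 4;         // RANKS
    uint64_t channels = 2;      // CHANNELS
    uint64_t rows = 8192;       // ROWS
    uint64_t cols = 1024;       // COLS (visible columns per bank)
    uint64_t matHeight = 1024;  // MATHeight (rows per subarray under SALP)
};

// SALP: number of subarrays = ROWS / MATHeight (1 when MATHeight == ROWS).
uint64_t SubarraysPerBank( const MemoryGeometry &g )
{
    assert( g.matHeight != 0 && g.rows % g.matHeight == 0 );
    return g.rows / g.matHeight;
}

// Real column count = COLS * DeviceWidth, per the config comment.
// deviceWidth is an assumption supplied by the caller.
uint64_t RealColumnsPerBank( const MemoryGeometry &g, uint64_t deviceWidth )
{
    return g.cols * deviceWidth;
}

// Total number of banks across all channels and ranks.
uint64_t TotalBanks( const MemoryGeometry &g )
{
    return g.banks * g.ranks * g.channels;
}
```

With the values above, `SubarraysPerBank` gives 8192 / 1024 = 8 subarrays per bank.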
## Gem5 trace format
|Field Name| CYCLE | OP | ADDRESS | DATA (64B) | THREADID |
|--------- | ----- | -- | ------- |----------- | -------- |
|Field Num | 0 | 1 | 2 | 3 | 4 |
|Instr |0 |R | 0x180 | 00e8da000000e8d5eb05004883c408c331ed4989d15e4889e24883e4f0505449c7c0a00a400048c7c1e00a400048c7c7a4024000e8a7010000f490904883ec08| 0|
| |6 |R | 0x1c0 | 488b05112f28004885c07402ffd04883c408c390909090909090909090909090554889e5534883ec08803db03b280000755fbb48306800488b05aa3b28004881| 0|
| |7 |R | 0x87e60 | 0100000000000000d2efffffff7f0000000000000000000000000000000000002100000000000000000060ffffffffff1000000000000000fffb8b0700000000| 0|
| |9 |R | 0x340 | bf07004600e8960c0000b800000000c9c39090909090909090909090909090904157b8000000004d89c741564989ce415541544d89cc55534881ec7802000048| 0|
| |19 |R | 0x380 | 8954240831d24885c048897c2418897424140f85a902000048637c2414488b442408891560352800488d7cf808488b8424b002000048893d24552800488905a5| 0|
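
A minimal sketch of a parser for one trace line, assuming the five fields are whitespace-separated in the order shown in the table and that `W` marks a write (the table only shows `R`). `TraceLine` and `ParseTraceLine` are hypothetical names, not part of NVMain.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// One record of the trace, fields in table order.
struct TraceLine
{
    uint64_t cycle = 0;
    char op = '?';        // 'R', or 'W' (assumed) for writes
    uint64_t address = 0;
    std::string data;     // hex text; 64 bytes would be 128 hex digits
    unsigned threadId = 0;
};

// Returns true and fills `out` if the line has all five fields.
bool ParseTraceLine( const std::string &line, TraceLine &out )
{
    std::istringstream in( line );
    std::string addressText;

    if( !( in >> out.cycle >> out.op >> addressText >> out.data >> out.threadId ) )
        return false;

    try
    {
        // Base 16 accepts an optional "0x" prefix.
        out.address = std::stoull( addressText, nullptr, 16 );
    }
    catch( const std::exception & )
    {
        return false;
    }

    return out.op == 'R' || out.op == 'W';
}
```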
## Testing environment
- Ubuntu 16.04
- gcc 5
- [Update] Should work on Ubuntu 20.04 with gcc 9.3.0
## NVMain structure
### Address mapping
- Input: Physical address mapping scheme
- e.g. Channel=1, Rank=1, Row=10, Bank=3, Subarray=6, Col=8
- Output: Ordered address mapping
- e.g. `order = {3, 0, 2, 4, 5, 1}; // R, C, BA, RA, CH, SA`
- [src/TranslationMethod.cpp](https://github.com/WeiCheng14159/NVmain/blob/master/src/TranslationMethod.cpp)
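
A minimal sketch of turning a scheme string into such an order array. It assumes the first token is the most significant field and the last token gets order 0. Under that assumption the example order above corresponds to the scheme `CH:RK:R:BK:SA:C`. `ParseMappingScheme` is a hypothetical helper, not the function in `TranslationMethod.cpp`.

```cpp
#include <array>
#include <sstream>
#include <string>

// Array positions follow the comment above: R, C, BA, RA, CH, SA.
enum Partition { ROW = 0, COL, BANK, RANK, CHANNEL, SUBARRAY, PARTITION_COUNT };

// Fills order[p] with the position of partition p (0 = least significant).
// Returns false on an unknown, repeated, or missing field.
bool ParseMappingScheme( const std::string &scheme,
                         std::array<int, PARTITION_COUNT> &order )
{
    order.fill( -1 );
    std::istringstream in( scheme );
    std::string token;
    int current = PARTITION_COUNT - 1;   // assumption: first token is most significant

    while( std::getline( in, token, ':' ) )
    {
        int part;
        if( token == "R" )       part = ROW;
        else if( token == "C" )  part = COL;
        else if( token == "BK" ) part = BANK;
        else if( token == "RK" ) part = RANK;
        else if( token == "CH" ) part = CHANNEL;
        else if( token == "SA" ) part = SUBARRAY;
        else return false;

        if( current < 0 || order[part] != -1 )
            return false;                // too many fields or a repeated field

        order[part] = current--;
    }

    return current == -1;                // all six fields were seen
}
```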
### Address Translation
- Input: Physical address
- Output: Corresponding value in each memory domain (channel, rank, bank, subarray, row, col)
- [src/AddressTranslator.cpp](https://github.com/WeiCheng14159/NVmain/blob/ad28c0c72d0d4eb838b2c3b3f779ca1e4a67cce3/src/AddressTranslator.cpp#L185)
:::spoiler code
```cpp=1
/*
* Translate() translates the physical address and returns the corresponding
* values of each memory domain
*/
void AddressTranslator::Translate( uint64_t address, uint64_t *row, uint64_t *col, uint64_t *bank,
uint64_t *rank, uint64_t *channel, uint64_t *subarray )
{
uint64_t refAddress;
MemoryPartition part;
uint64_t *partitions[6] = { row, col, bank, rank, channel, subarray };
if( GetTranslationMethod( ) == NULL )
{
std::cerr << "Divider Translator: Translation method not specified!" << std::endl;
return;
}
int busOffsetBits = mlog2( busWidth / 8 );
int burstBits = mlog2( (busWidth * burstLength) / 8 );
lowColBits = burstBits - busOffsetBits;
/* first of all, truncate the bus offset bits */
refAddress = address >> busOffsetBits;
/* then, truncate the lowest column bits */
refAddress >>= lowColBits;
/* 0->4, low to high, FindOrder() will find the correct one */
for( int i = 0; i < 6; i++ )
{
FindOrder( i, &part );
/*
* The new memsize does not include this partition, so dividing by the
* new memsize will give us the right channel/rank/bank/whatever.
*/
*partitions[part] = Modulo( refAddress, part );
/*
* "Mask off" the first partition number we got. For example if memsize = 1000
* and the address is 8343, the partition would be 8, and we will look at 343
* to determine the rest of the address now.
*/
refAddress = Divide( refAddress, part );
}
}
```
:::
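
A standalone sketch of the same modulo-then-divide walk, so the decoding can be tried outside the simulator. It assumes every partition size is a power of two given as a bit width, and it folds the bus-offset and low-column truncation into one `skipBits` argument. `Decode` and `DecodedAddress` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Array positions: R, C, BA, RA, CH, SA (same as the partitions[] array above).
enum Partition { ROW = 0, COL, BANK, RANK, CHANNEL, SUBARRAY, PARTITION_COUNT };

struct DecodedAddress
{
    std::array<uint64_t, PARTITION_COUNT> field{};
};

// bits[p]  : bit width of partition p (assumed power-of-two sizes)
// order[p] : position of partition p, 0 = least significant
// skipBits : low bits dropped first (bus offset bits + low column bits)
DecodedAddress Decode( uint64_t address,
                       const std::array<unsigned, PARTITION_COUNT> &bits,
                       const std::array<int, PARTITION_COUNT> &order,
                       unsigned skipBits )
{
    DecodedAddress out;
    uint64_t ref = address >> skipBits;

    // Walk positions from low to high, like the FindOrder() loop.
    for( int pos = 0; pos < PARTITION_COUNT; pos++ )
    {
        for( int p = 0; p < PARTITION_COUNT; p++ )
        {
            if( order[p] != pos )
                continue;

            uint64_t count = uint64_t( 1 ) << bits[p];
            out.field[p] = ref % count;   // value of this partition
            ref /= count;                 // drop it and move to the next one
        }
    }

    return out;
}
```

As an example of `skipBits`: with a 64-bit bus and a burst length of 8 (assumed values), `busOffsetBits` is 3 and `lowColBits` is 3, so 6 bits are dropped before decoding.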
### Memory Controller
- Schedules memory requests
- [src/MemoryController.cpp](https://github.com/WeiCheng14159/NVmain/blob/ad28c0c72d0d4eb838b2c3b3f779ca1e4a67cce3/src/MemoryController.cpp#L231)
:::spoiler code
```cpp=1
void MemoryController::ScheduleCommandWake( )
{
/* Schedule wake event for memory commands if not scheduled. */
ncycle_t nextWakeup = NextIssuable( NULL );
/* Avoid scheduling multiple duplicate events. */
bool nextWakeupScheduled = GetEventQueue()->FindCallback( this,
(CallbackPtr)&MemoryController::CommandQueueCallback,
nextWakeup, NULL, commandQueuePriority );
if( !nextWakeupScheduled )
{
GetEventQueue( )->InsertCallback( this,
(CallbackPtr)&MemoryController::CommandQueueCallback,
nextWakeup, NULL, commandQueuePriority );
}
}
```
:::
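
The find-then-insert pattern in `ScheduleCommandWake` avoids queuing the same wake-up twice. A minimal sketch of that idea with a toy queue follows. `TinyEventQueue` and `ScheduleWake` are hypothetical stand-ins, and the real event queue also matches on callback, data, and priority.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

using Cycle = uint64_t;

// Toy event queue: maps a wake-up cycle to the ids of recipients waiting on it.
class TinyEventQueue
{
  public:
    // True if `recipient` already has an event at cycle `when`.
    bool Find( int recipient, Cycle when ) const
    {
        auto range = events.equal_range( when );
        for( auto it = range.first; it != range.second; ++it )
        {
            if( it->second == recipient )
                return true;
        }
        return false;
    }

    void Insert( int recipient, Cycle when ) { events.emplace( when, recipient ); }

    std::size_t Size() const { return events.size(); }

  private:
    std::multimap<Cycle, int> events;
};

// Same shape as ScheduleCommandWake(): insert only when no duplicate exists.
// Returns true if a new event was inserted.
bool ScheduleWake( TinyEventQueue &queue, int controller, Cycle nextWakeup )
{
    if( queue.Find( controller, nextWakeup ) )
        return false;

    queue.Insert( controller, nextWakeup );
    return true;
}
```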
### Rank
- `Rank` is a subclass of `NVMObject`
- `StandardRank` is a subclass of `Rank`
- [Ranks/StandardRank/StandardRank.cpp](https://github.com/WeiCheng14159/NVmain/blob/master/Ranks/StandardRank/StandardRank.cpp)
- `void StandardRank::Cycle( ncycle_t steps )`
- :::spoiler code
```cpp=
void StandardRank::Cycle( ncycle_t steps )
{
for( ncounter_t childIdx = 0; childIdx < GetChildCount( ); childIdx++ )
GetChild( childIdx )->Cycle( steps );
/* Count cycle numbers and calculate background energy for each state */
switch( state )
{
/* active powerdown */
case STANDARDRANK_PDA:
fastExitActiveCycles += steps;
if( p->EnergyModel == "current" )
backgroundEnergy += ( p->EIDD3P * (double)steps ) * (double)deviceCount;
else
backgroundEnergy += ( p->Epda * (double)steps );
break;
/* precharge powerdown fast exit */
case STANDARDRANK_PDPF:
fastExitPrechargeCycles += steps;
if( p->EnergyModel == "current" )
backgroundEnergy += ( p->EIDD2P1 * (double)steps ) * (double)deviceCount;
else
backgroundEnergy += ( p->Epdpf * (double)steps );
break;
/* precharge powerdown slow exit */
case STANDARDRANK_PDPS:
slowExitCycles += steps;
if( p->EnergyModel == "current" )
backgroundEnergy += ( p->EIDD2P0 * (double)steps ) * (double)deviceCount;
else
backgroundEnergy += ( p->Epdps * (double)steps );
break;
/* active standby */
case STANDARDRANK_REFRESHING:
case STANDARDRANK_OPEN:
activeCycles += steps;
if( p->EnergyModel == "current" )
backgroundEnergy += ( p->EIDD3N * (double)steps ) * (double)deviceCount;
else
backgroundEnergy += ( p->Eactstdby * (double)steps );
break;
/* precharge standby */
case STANDARDRANK_CLOSED:
standbyCycles += steps;
if( p->EnergyModel == "current" )
backgroundEnergy += ( p->EIDD2N * (double)steps ) * (double)deviceCount;
else
backgroundEnergy += ( p->Eprestdby * (double)steps );
break;
default:
if( p->EnergyModel == "current" )
backgroundEnergy += ( p->EIDD2N * (double)steps ) * (double)deviceCount;
else
backgroundEnergy += ( p->Eprestdby * (double)steps );
break;
}
}
```
:::
- `bool StandardRank::IsIssuable( NVMainRequest *req, FailReason *reason )`
- :::spoiler code
```cpp=
bool StandardRank::IsIssuable( NVMainRequest *req, FailReason *reason )
{
uint64_t opBank;
bool rv;
req->address.GetTranslatedAddress( NULL, NULL, &opBank, NULL, NULL, NULL );
rv = true;
if( req->type == ACTIVATE )
{
if( nextActivate > GetEventQueue( )->GetCurrentCycle( )
|| ( lastActivate[(RAWindex + 1) % rawNum] + p->tRAW )
> GetEventQueue()->GetCurrentCycle() )
{
rv = false;
if( reason )
reason->reason = RANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
if( rv == false )
{
if( nextActivate > GetEventQueue( )->GetCurrentCycle( ) )
{
actWaits++;
actWaitTotal += nextActivate - GetEventQueue( )->GetCurrentCycle( );
}
if( ( lastActivate[RAWindex] + p->tRRDR )
> GetEventQueue( )->GetCurrentCycle( ) )
{
rrdWaits++;
rrdWaitTotal += ( lastActivate[RAWindex] +
p->tRRDR - (GetEventQueue()->GetCurrentCycle()) );
}
if( ( lastActivate[( RAWindex + 1 ) % rawNum] + p->tRAW )
> GetEventQueue( )->GetCurrentCycle( ) )
{
fawWaits++;
fawWaitTotal += ( lastActivate[( RAWindex + 1 ) % rawNum] +
p->tRAW - GetEventQueue( )->GetCurrentCycle( ) );
}
}
}
else if( req->type == READ || req->type == READ_PRECHARGE )
{
if( nextRead > GetEventQueue( )->GetCurrentCycle( ) )
{
rv = false;
if( reason )
reason->reason = RANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else if( req->type == WRITE || req->type == WRITE_PRECHARGE )
{
if( nextWrite > GetEventQueue( )->GetCurrentCycle( ) )
{
rv = false;
if( reason )
reason->reason = RANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else if( req->type == PRECHARGE || req->type == PRECHARGE_ALL )
{
if( nextPrecharge > GetEventQueue( )->GetCurrentCycle( ) )
{
rv = false;
if( reason )
reason->reason = RANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else if( req->type == POWERDOWN_PDA
|| req->type == POWERDOWN_PDPF
|| req->type == POWERDOWN_PDPS )
{
rv = CanPowerDown( req );
if( !rv && reason )
reason->reason = RANK_TIMING;
}
else if( req->type == POWERUP )
{
rv = CanPowerUp( req );
if( !rv && reason )
reason->reason = RANK_TIMING;
}
else if( req->type == REFRESH )
{
/* firstly, check whether REFRESH can be issued to a rank */
if( nextActivate > GetEventQueue()->GetCurrentCycle()
|| ( lastActivate[( RAWindex + 1 ) % rawNum] + p->tRAW
> GetEventQueue( )->GetCurrentCycle( ) ) )
{
rv = false;
if( reason )
reason->reason = RANK_TIMING;
return rv;
}
/* REFRESH can only be issued when all banks in the group are issuable */
assert( (opBank + banksPerRefresh) <= bankCount );
for( ncounter_t i = 0; i < banksPerRefresh; i++ )
{
rv = GetChild( opBank + i )->IsIssuable( req, reason );
if( rv == false )
return rv;
}
}
else
{
/* Unknown command -- See if child module can handle it. */
rv = GetChild( req )->IsIssuable( req, reason );
}
return rv;
}
```
:::
- `bool StandardRank::IssueCommand( NVMainRequest *req )`
- :::spoiler code
```cpp=
bool StandardRank::IssueCommand( NVMainRequest *req )
{
bool rv = false;
if( !IsIssuable( req ) )
{
...
}
else
{
rv = true;
switch( req->type )
{
case ACTIVATE:
rv = this->Activate( req );
break;
case READ:
case READ_PRECHARGE:
rv = this->Read( req );
break;
case WRITE:
case WRITE_PRECHARGE:
rv = this->Write( req );
break;
case PRECHARGE:
case PRECHARGE_ALL:
rv = this->Precharge( req );
break;
case POWERDOWN_PDA:
case POWERDOWN_PDPF:
case POWERDOWN_PDPS:
rv = this->PowerDown( req );
break;
case POWERUP:
rv = this->PowerUp( req );
break;
case REFRESH:
rv = this->Refresh( req );
break;
default:
std::cout << "NVMain: Rank: Unknown operation in command queue! "
<< req->type << std::endl;
break;
}
}
return rv;
}
```
:::
- `bool StandardRank::Activate( NVMainRequest *request )`
- :::spoiler code
```cpp=
bool StandardRank::Activate( NVMainRequest *request )
{
...
/*
* Ensure that the time since the last bank activation is >= tRRD. This is to limit
* power consumption.
*/
if( nextActivate <= GetEventQueue()->GetCurrentCycle()
&& lastActivate[( RAWindex + 1 ) % rawNum] + p->tRAW
<= GetEventQueue( )->GetCurrentCycle( ) )
{
/* issue ACTIVATE to target bank */
GetChild( request )->IssueCommand( request );
if( state == STANDARDRANK_CLOSED )
state = STANDARDRANK_OPEN;
/* move to the next counter */
RAWindex = (RAWindex + 1) % rawNum;
lastActivate[RAWindex] = GetEventQueue()->GetCurrentCycle();
nextActivate = MAX( nextActivate,
GetEventQueue()->GetCurrentCycle() + p->tRRDR );
}
else
{
std::cerr << "NVMain Error: Rank Activation FAILED! "
<< "Did you check IsIssuable?" << std::endl;
}
return true;
}
```
:::
- `bool StandardRank::Read( NVMainRequest *request )`
- :::spoiler code
```cpp=
bool StandardRank::Read( NVMainRequest *request )
{
...
/* issue READ or READ_PRECHARGE to target bank */
bool success = GetChild( request )->IssueCommand( request );
/* Even though the command may be READ_PRECHARGE, it still works */
nextRead = MAX( nextRead,
GetEventQueue()->GetCurrentCycle()
+ MAX( p->tBURST, p->tCCD ) * request->burstCount );
nextWrite = MAX( nextWrite,
GetEventQueue()->GetCurrentCycle()
+ MAX( p->tBURST, p->tCCD ) * (request->burstCount - 1)
+ p->tCAS + p->tBURST + p->tRTRS - p->tCWD );
...
}
```
:::
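
The rank-level ACTIVATE checks combine two limits: a minimum gap between consecutive activations (`tRRDR`) and at most `rawNum` activations inside any `tRAW` window. A minimal sketch of that rolling window follows. It swaps the circular `lastActivate[]` array for a `std::deque` and treats a window that is not yet full as unconstrained, so it is an illustration of the idea and not a copy of `StandardRank`. `ActivateWindow` is a hypothetical class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

using Cycle = uint64_t;

class ActivateWindow
{
  public:
    ActivateWindow( std::size_t rawNum, Cycle tRAW, Cycle tRRD )
        : rawNum( rawNum ), tRAW( tRAW ), tRRD( tRRD )
    {
        assert( rawNum > 0 );
    }

    bool CanActivate( Cycle now ) const
    {
        if( now < nextActivate )
            return false;                      // activate-to-activate gap not met
        if( recent.size() < rawNum )
            return true;                       // fewer than rawNum activations so far
        return recent.front() + tRAW <= now;   // oldest activation left the window
    }

    // Records an activation at `now`; returns false if it is not allowed yet.
    bool Activate( Cycle now )
    {
        if( !CanActivate( now ) )
            return false;

        if( recent.size() == rawNum )
            recent.pop_front();
        recent.push_back( now );
        nextActivate = std::max( nextActivate, now + tRRD );
        return true;
    }

  private:
    std::size_t rawNum;
    Cycle tRAW;
    Cycle tRRD;
    Cycle nextActivate = 0;
    std::deque<Cycle> recent;   // timestamps of the last rawNum activations
};
```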
### Bank
- `Bank` is a subclass of `NVMObject`
- `DDR3Bank` is a subclass of `Bank`
- [Banks/DDR3Bank/DDR3Bank.cpp](https://github.com/WeiCheng14159/NVmain/blob/master/Banks/DDR3Bank/DDR3Bank.cpp)
- `bool DDR3Bank::IsIssuable( NVMainRequest *req, FailReason *reason )`
:::spoiler code
```cpp=
bool DDR3Bank::IsIssuable( NVMainRequest *req, FailReason *reason )
{
bool rv = true;
uint64_t opRank, opBank, opRow, opSubArray;
req->address.GetTranslatedAddress( &opRow, NULL, &opBank, &opRank, NULL, &opSubArray );
if( nextCommand != CMD_NOP )
return false;
if( req->type == ACTIVATE )
{
/* if the bank-level nextActive is not satisfied, cannot issue */
if( nextActivate > ( GetEventQueue()->GetCurrentCycle() )
|| state == DDR3BANK_PDPF || state == DDR3BANK_PDPS || state == DDR3BANK_PDA )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
actWaits++;
actWaitTotal += nextActivate - (GetEventQueue()->GetCurrentCycle());
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else if( req->type == READ || req->type == READ_PRECHARGE )
{
if( nextRead > (GetEventQueue()->GetCurrentCycle())
|| state != DDR3BANK_OPEN )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else if( req->type == WRITE || req->type == WRITE_PRECHARGE )
{
if( nextWrite > (GetEventQueue()->GetCurrentCycle())
|| state != DDR3BANK_OPEN )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else if( req->type == PRECHARGE || req->type == PRECHARGE_ALL )
{
if( nextPrecharge > (GetEventQueue()->GetCurrentCycle())
|| ( state != DDR3BANK_CLOSED && state != DDR3BANK_OPEN ) )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
}
else
{
if( req->type == PRECHARGE_ALL )
{
std::deque<ncounter_t>::iterator it;
for( it = activeSubArrayQueue.begin();
it != activeSubArrayQueue.end(); ++it )
{
rv = GetChild( (*it) )->IsIssuable( req, reason );
if( rv == false )
break;
}
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
}
else if( req->type == POWERDOWN_PDA
|| req->type == POWERDOWN_PDPF
|| req->type == POWERDOWN_PDPS )
{
if( nextPowerDown > (GetEventQueue()->GetCurrentCycle())
|| ( state != DDR3BANK_CLOSED && state != DDR3BANK_OPEN )
|| ( ( req->type == POWERDOWN_PDPF || req->type == POWERDOWN_PDPS )
&& state == DDR3BANK_OPEN ) )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
}
for( ncounter_t saIdx = 0; saIdx < subArrayNum; saIdx++ )
{
if( !GetChild(saIdx)->IsIssuable( req ) )
{
rv = false;
break;
}
}
}
else if( req->type == POWERUP )
{
if( nextPowerUp > (GetEventQueue()->GetCurrentCycle())
|| ( state != DDR3BANK_PDPF && state != DDR3BANK_PDPS && state != DDR3BANK_PDA ) )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
}
for( ncounter_t saIdx = 0; saIdx < subArrayNum; saIdx++ )
{
if( !GetChild(saIdx)->IsIssuable( req ) )
{
rv = false;
break;
}
}
}
else if( req->type == REFRESH )
{
if( nextActivate > ( GetEventQueue()->GetCurrentCycle() )
|| ( state != DDR3BANK_CLOSED && state != DDR3BANK_OPEN ) )
{
rv = false;
if( reason )
reason->reason = BANK_TIMING;
}
else
{
rv = GetChild( req )->IsIssuable( req, reason );
}
}
else
{
/* Unknown command, just ask child modules. */
rv = GetChild( req )->IsIssuable( req, reason );
}
return rv;
}
```
:::
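
Leaving out the subarray delegation and the wait counters, the bank-level checks in `IsIssuable` reduce to a state test plus a timing test. A condensed sketch of that reading follows. The enums and function names are hypothetical, and only the five bank states that appear in the listing are modeled.

```cpp
#include <cstdint>

enum class BankState { Closed, Open, PowerDownActive,
                       PowerDownPrechargeFast, PowerDownPrechargeSlow };

enum class Op { Activate, Read, Write, Precharge, Refresh,
                PowerDownActive, PowerDownPrechargeFast, PowerDownPrechargeSlow,
                PowerUp };

// State part of the check only; timing is handled separately below.
bool StateAllows( BankState state, Op op )
{
    const bool poweredDown = state == BankState::PowerDownActive
                          || state == BankState::PowerDownPrechargeFast
                          || state == BankState::PowerDownPrechargeSlow;

    switch( op )
    {
        case Op::Activate:
            return !poweredDown;
        case Op::Read:
        case Op::Write:
            return state == BankState::Open;
        case Op::Precharge:
        case Op::Refresh:
        case Op::PowerDownActive:
            return !poweredDown;               // bank must be CLOSED or OPEN
        case Op::PowerDownPrechargeFast:
        case Op::PowerDownPrechargeSlow:
            return state == BankState::Closed; // precharge powerdown needs a closed bank
        case Op::PowerUp:
            return poweredDown;
    }
    return false;
}

// `earliest` plays the role of nextActivate / nextRead / nextWrite / ...
bool BankCanIssue( BankState state, Op op, uint64_t now, uint64_t earliest )
{
    return now >= earliest && StateAllows( state, op );
}
```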
### NVMObject
- This is the most important class in the simulator
- Nearly all components (rank, bank, memory controller, ...) are derived from this class
```cpp=
class NVMObject
{
public:
NVMObject( );
virtual ~NVMObject( );
virtual void AddChild( NVMObject *c );
NVMObject *_FindChild( NVMainRequest *req, const char *childClass );
NVMObject_hook *GetParent( );
std::vector<NVMObject_hook *>& GetChildren( );
NVMObject_hook *GetChild( NVMainRequest *req );
NVMObject_hook *GetChild( ncounter_t child );
NVMObject_hook *GetChild( );
ncounter_t GetChildId( NVMObject *c );
ncounter_t GetChildCount( );
...
protected:
NVMObject_hook *parent;
AddressTranslator *decoder;
...
std::vector<NVMObject_hook *> children;
std::vector<NVMObject *> *hooks;
HookType hookType, currentHookType;
};
```
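
A minimal sketch of the parent/child pattern that `NVMObject` provides: each object owns its children and forwards a check down the tree, the way `StandardRank::IsIssuable` calls `GetChild( req )->IsIssuable( ... )`. `Node` is a hypothetical class. A vector of child indices stands in for the decoded request address, and hooks are left out.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Node
{
  public:
    explicit Node( std::string name ) : name_( std::move( name ) ) {}

    // Takes ownership of the child and returns a non-owning pointer to it.
    Node *AddChild( std::unique_ptr<Node> child )
    {
        child->parent_ = this;
        children_.push_back( std::move( child ) );
        return children_.back().get();
    }

    const std::string &Name() const { return name_; }
    Node *GetParent() const { return parent_; }
    std::size_t GetChildCount() const { return children_.size(); }
    Node *GetChild( std::size_t index ) const { return children_.at( index ).get(); }

    void SetBusy( bool busy ) { busy_ = busy; }

    // Check this level first, then delegate to the child selected by path[depth].
    bool IsIssuable( const std::vector<std::size_t> &path, std::size_t depth = 0 ) const
    {
        if( busy_ )
            return false;
        if( depth == path.size() )
            return true;
        return GetChild( path[depth] )->IsIssuable( path, depth + 1 );
    }

  private:
    std::string name_;
    Node *parent_ = nullptr;
    std::vector<std::unique_ptr<Node>> children_;
    bool busy_ = false;
};
```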
## Tracing code execution in NVMain
- `main` function
```cpp=
int main( int argc, char *argv[] )
{
TraceMain *traceRunner = new TraceMain( );
return traceRunner->RunTrace( argc, argv );
}
```
- `int TraceMain::RunTrace( int argc, char *argv[] )`
```cpp=
int TraceMain::RunTrace( int argc, char *argv[] )
{
...
AddChild( nvmain );
nvmain->SetParent( this );
...
nvmain->SetConfig( config, "defaultMemory", true );
...
```
- `void NVMObject::AddChild( NVMObject *c )`
```cpp=
void NVMObject::AddChild( NVMObject *c )
{
NVMObject_hook *hook = new NVMObject_hook( c );
std::vector<NVMObject *>::iterator it;
/*
* Copy all hooks from the parent to the child NVMObject.
*/
for( int i = 0; i < static_cast<int>(NVMHOOK_COUNT); i++ )
{
for( it = hooks[i].begin(); it != hooks[i].end(); it++ )
{
c->AddHook( (*it) );
}
}
children.push_back( hook );
}
```
### Function call graph
```graphviz
digraph g {
graph [ rankdir = "LR" ];
node [ fontsize = "16" shape = "record"];
edge [ ];
"traceRunner" [
label = "<f0> traceRunner| <f1>"
];
"nvmain" [
label = "<f0> nvmain| <f1>"
];
"memoryControllers_0" [
label = "<f0> memoryControllers[0]| <f1>"
];
"memoryControllers_1" [
label = "<f0> memoryControllers[1]| <f1>"
];
"InterconnectFactory" [
label = "<f0> InterconnectFactory| <f1>"
];
main -> traceRunner;
traceRunner -> nvmain;
nvmain:f0->memoryControllers_0:f0;
nvmain:f0->memoryControllers_1:f0->InterconnectFactory:f0;
}
```
### Run simulator
- Run command
```bash=1
./nvmain.debug Config/PCM_ISSCC_2012_4GB.config Tests/Traces/hello_world.nvt 0
```
- Simulator module hierarchy
```c=1
defaultMemory
-- defaultMemory.channel0.FRFCFS-WQF
---- defaultMemory.channel0.FRFCFS-WQF.channel0
------ defaultMemory.channel0.FRFCFS-WQF.channel0.rank0
-------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank0
---------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank0.subarray0
-------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank1
---------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank1.subarray0
-------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank2
---------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank2.subarray0
-------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank3
---------- defaultMemory.channel0.FRFCFS-WQF.channel0.rank0.bank3.subarray0
...
```
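
A small sketch that regenerates a listing of this shape from the counts of each level. The naming pattern (dotted full names, two dashes per depth level) is inferred from the listing above, not taken from NVMain's code, and `ListHierarchy` is a hypothetical helper.

```cpp
#include <string>
#include <vector>

// Appends one line: two dashes per depth level, then the full dotted name.
static void Emit( std::vector<std::string> &out, unsigned depth,
                  const std::string &fullName )
{
    std::string line( 2 * depth, '-' );
    if( depth > 0 )
        line += ' ';
    out.push_back( line + fullName );
}

std::vector<std::string> ListHierarchy( const std::string &root,
                                        const std::string &controller,
                                        unsigned channels, unsigned ranks,
                                        unsigned banks, unsigned subarrays )
{
    std::vector<std::string> out;
    Emit( out, 0, root );

    for( unsigned ch = 0; ch < channels; ch++ )
    {
        // Memory controller, then the interconnect for the same channel.
        std::string mc = root + ".channel" + std::to_string( ch ) + "." + controller;
        Emit( out, 1, mc );
        std::string ic = mc + ".channel" + std::to_string( ch );
        Emit( out, 2, ic );

        for( unsigned rk = 0; rk < ranks; rk++ )
        {
            std::string rank = ic + ".rank" + std::to_string( rk );
            Emit( out, 3, rank );

            for( unsigned bk = 0; bk < banks; bk++ )
            {
                std::string bank = rank + ".bank" + std::to_string( bk );
                Emit( out, 4, bank );

                for( unsigned sa = 0; sa < subarrays; sa++ )
                    Emit( out, 5, bank + ".subarray" + std::to_string( sa ) );
            }
        }
    }

    return out;
}
```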