# C++ Orthodox Canonical Form

## What's Orthodox Canonical Form in C++?
- It’s the standard set of special member functions that ensure correct and safe object copying and destruction. The form is usually listed with a default constructor as well; these notes focus on the three members below (the "rule of three"):
  - Copy constructor
    ```c++
    MyClass(const MyClass& other);
    ```
  - Copy assignment operator
    ```c++
    MyClass& operator=(const MyClass& other);
    ```
  - Destructor
    ```c++
    ~MyClass();
    ```

## Why is it needed in C++?
- If your class manages resources (e.g. raw pointers, file handles, sockets), the default implementations provided by the compiler:
  - Do **shallow copying** (e.g. just copying the pointer values)
  - Can lead to **double deletes, memory leaks, or dangling pointers**
- Below are some examples of bugs caused by not following the rule of three

### Example 1: Double delete error
```c++
#include <iostream>

class Bad {
private:
    int* data;

public:
    // Has a destructor, but no copy constructor or copy assignment operator
    Bad(int val) { data = new int(val); }
    ~Bad() { delete data; }

    void print() { std::cout << "Value: " << *data << std::endl; }
};

int main() {
    Bad a(42);
    Bad b = a; // Implicit copy constructor does shallow copy!

    a.print();
    b.print();

    // When main ends, both 'a' and 'b' destructors are called,
    // both will try to delete the same pointer -> DOUBLE DELETE!
}
```
- What happens
  - ```Bad b = a``` uses the compiler-generated copy constructor, which copies the pointer (not the value it points to)
  - Now ```a.data``` and ```b.data``` point to the same memory
  - When ```main``` returns, both destructors call ```delete data``` on that same address: a double free, which is undefined behavior

### Example 2: Dangling pointer / use after free
```c++
#include <iostream>

class Bad {
private:
    int* data;

public:
    // No copy constructor, no copy assignment operator
    Bad(int val) { data = new int(val); }
    ~Bad() { delete data; }

    void setData(int val) { *data = val; }
    void print() { std::cout << "Value: " << *data << std::endl; }
};

void corruptMemory(Bad b) {
    b.setData(999); // Modifies the same memory as the original object (shallow copy)
} // b is destroyed here and deletes the shared memory

int main() {
    Bad obj(10);
    corruptMemory(obj); // Triggers copy constructor (shallow copy)
    obj.print();        // Use after free: obj.data now points to freed memory
    // When obj goes out of scope, its destructor deletes the same pointer again
}
```
- What happens
  - ```corruptMemory(obj)``` copies ```obj``` into the parameter ```b``` with the compiler-generated (shallow) copy constructor
  - Both ```obj``` and ```b``` hold pointers to the same memory
  - Changes made through ```b``` affect ```obj```, which is unexpected
  - When ```corruptMemory``` returns, ```b``` is destroyed and frees that memory, so ```obj.data``` dangles and ```obj.print()``` is a use after free
  - When ```obj``` is destroyed, the same pointer is deleted a second time (double delete)

## What do the three functions do?
- A copy constructor that makes a deep copy
- A copy assignment operator that frees the old memory and copies the new value
- A destructor that releases the resource

### Example: Clean without error
- Box.h
  ```c++
  #ifndef BOX_H
  #define BOX_H

  class Box {
  private:
      int* value;

  public:
      Box(int val);                     // Constructor
      Box(const Box& other);            // Copy constructor
      Box& operator=(const Box& other); // Copy assignment operator
      ~Box();                           // Destructor

      void show() const;
  };

  #endif // BOX_H
  ```
- Box.cpp
  ```c++
  #include "Box.h"
  #include <iostream>

  Box::Box(int val) {
      std::cout << "Constructor called\n";
      value = new int(val);
  }

  Box::Box(const Box& other) {
      std::cout << "Copy constructor called\n";
      value = new int(*other.value); // deep copy
  }

  Box& Box::operator=(const Box& other) {
      std::cout << "Copy assignment operator called\n";
      if (this != &other) {              // avoid self assignment
          delete value;                  // cleanup old memory
          value = new int(*other.value); // deep copy
      }
      return *this;
  }

  Box::~Box() {
      std::cout << "Destructor called\n";
      delete value;
  }

  void Box::show() const {
      std::cout << "Box value: " << *value << "\n";
  }
  ```
- main.cpp
  ```c++
  #include "Box.h"

  int main() {
      Box a(100); // constructor
      Box b = a;  // copy constructor

      Box c(999);
      c = a;      // copy assignment

      a.show();   // Box value: 100
      b.show();   // Box value: 100
      c.show();   // Box value: 100
      return 0;
  }
  ```

### What does the Copy Constructor do
- Initially (the addresses are illustrative)
  - a.value
    - Points to memory address: 0xABC
    - Value at that address: 100
  - b.value
    - Points to memory address: ? (b does not exist yet)
    - Value at that address: ?
  - c.value
    - Points to memory address: 0xDEF
    - Value at that address: 999
- When ```Box b = a``` is called
  - Allocate new memory (e.g. 0xBEE) for b.value
  - Copy the value from *a.value into the new memory
- Now b.value
  - Points to memory address: 0xBEE (new)
  - Value at that address: 100 (copied)
- Now a and b are independent.
Changing one won’t affect the other

### What does the Copy Assignment Operator do
- When ```c = a``` is called
  - Check that c and a aren't the same object
  - Delete c.value (0xDEF, holding 999)
  - Allocate new memory (e.g. 0xCAF)
  - Copy the value from *a.value into the new memory
- Now c.value
  - Points to memory address: 0xCAF (new)
  - Value at that address: 100 (copied)

### What does the Destructor do
- Frees the memory owned by the object, preventing a memory leak

## Dive into Copy Constructor
```c++
Box(const Box& other) {
    // Deep copy directly: `*this = other;` would make operator= delete an uninitialized pointer
    value = new int(*other.value);
}
```

### Why reference (&)?
```c++
Box(Box other); // ❌ BAD: ill-formed, the compiler rejects it
```
- Because passing by value would:
  - Call the copy constructor to make a copy of the parameter itself
  - Which would call the copy constructor again...
  - Which would call the copy constructor again...
  - 🔁 Infinite recursion, so the language forbids this signature and the code does not compile
- So we use a reference to avoid copying the parameter

### Why const?
```c++
Box(const Box& other); // ✅ Accepts const and non-const
Box(Box& other);       // ❌ Rejects const objects
```
Because you only want to read from other, not modify it:
- The whole point is to copy from other, not change it
- Making it const lets you safely pass both const and non-const objects
- It also lets the constructor accept temporaries, which cannot bind to a non-const reference

## Dive into Copy Assignment Operator
```c++
// Take a const reference to avoid copying the argument.
// Return a non-const reference to *this to support
// assignment chaining without an extra copy.
Box& Box::operator=(const Box& other) {
    if (this != &other) {
        // Do deep copy, cleanup, etc.
    }
    return *this; // return reference to the current object
}
```

### Why return ```Box&```?
- Because of how assignment works in C++
- Assignments can be chained, which is legal and expected behavior in C++
  ```c++
  Box a(1), b(2), c(3);
  a = b = c;
  ```
- What happens internally
  ```c++
  a = (b = c);
  ```
  - ```b = c``` must return b itself, so it can be assigned to a
  - That means ```operator=``` must return a reference to the object that was assigned ```(*this)```
- What if it returned void?
  - Then the chained assignment wouldn't compile
    ```c++
    a = b = c; // ❌ b = c returns void, can't assign that to a
    ```
- What if it returned Box (by value)?
  ```c++
  Box Box::operator=(const Box& other); // Bad idea
  ```
  - It would make a copy of *this when returning
  - That invokes the copy constructor on every assignment, which costs an unnecessary allocation and copy
  - ```(a = b) = c``` would assign c to a temporary copy instead of to a
  - Worse: if the copy constructor is itself written as ```*this = other```, the two functions call each other forever (infinite recursion)
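
The points above about deep copying and returning a reference from `operator=` can be checked with a small self-contained sketch. `Cell` and `demoChainedAssignment` are hypothetical names introduced only for this illustration; `Cell` mirrors the `Box` class from these notes, with accessors added so the behavior can be asserted. It also allocates the new value before deleting the old one, a small variation on the `Box::operator=` shown earlier.

```cpp
#include <cassert>

// Hypothetical class for illustration; same ownership model as Box.
class Cell {
private:
    int* value;

public:
    explicit Cell(int val) : value(new int(val)) {}
    Cell(const Cell& other) : value(new int(*other.value)) {} // deep copy
    Cell& operator=(const Cell& other) {
        if (this != &other) {                   // avoid self assignment
            int* fresh = new int(*other.value); // allocate the copy first
            delete value;                       // then release the old memory
            value = fresh;
        }
        return *this; // reference to the assigned object
    }
    ~Cell() { delete value; }

    int get() const { return *value; }
    void set(int val) { *value = val; }
    const int* address() const { return value; }
};

// Exercises chained assignment; an assert aborts if its claim is false.
void demoChainedAssignment() {
    Cell a(1), b(2), c(3);
    a = b = c; // parsed as a = (b = c)

    // All three hold the same value, each in its own allocation.
    assert(a.get() == 3 && b.get() == 3 && c.get() == 3);
    assert(a.address() != b.address());
    assert(b.address() != c.address());

    // Changing one object does not affect the others.
    c.set(42);
    assert(a.get() == 3 && b.get() == 3);

    // operator= returned a reference to a, so set() modifies a itself.
    (a = b).set(7);
    assert(a.get() == 7);
    assert(b.get() == 3);
}
```

Calling `demoChainedAssignment()` from `main` should finish silently. If `operator=` returned `Cell` by value instead, the last step would modify a temporary and `a.get() == 7` would fail.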