# Dynamic Array Growth Factor ## Amortized Analysis ### Increase by a constant factor 2 先以上課講到的兩倍作為例子 | ith | Array | Cost (append + copy) | |---|---|---| | 0 | \[ ] | 0 | | 1 | \[a] | 1 | | 2 | \[a, b] | 1 + 1 = 2 | | 3 | \[a, b, c, \_] | 1 + 2 = 3 | | 4 | \[a, b, c, d] | 1 | | 5 | \[a, b, c, d, e, \_, \_, \_] | 1 + 4 = 5 | 觀察一下即可發現$c_i$的關係式 $$c_i=\begin{cases}i,&\text{if }i-1=2^k.\\1,&\text{otherwise.}\end{cases}$$ 接著要算出total cost,把cost部分分成兩部分看,即可列出下面式子 $$\begin{aligned} \sum_{i=1}^nc_i& =\sum_{i=0}^{\lfloor\log_2\left(n{-}1\right)\rfloor }2^i+\sum_{i=1}^n1 \\ &=2^{\lfloor\log_2\left(n{-}1\right)\rfloor+1}-1+n \\&=2\cdot2^{\lfloor\log_2\left(n{-}1\right)\rfloor}-1+n\\ &\leq 3n \end{aligned}$$ 所以可得出n個append operation的total cost最多為$3n$,, 而對於每個單一append操作的amortized cost最多為3,即O(1) 討論完2倍,接著來探討看看如果每次申請新的空間是以加k大小或k倍的情況 ### Increase by a constant k $\begin{aligned}\text{Total cost} &= n +(1)+(1+k)+(1+2k)+(1+3k) +\space ...\space+(1+(\lceil\frac{n-1}{k}\rceil-1)k) \\&=n + \lceil\frac{n-1}{k}\rceil+\frac{\lceil\frac{n-1}{k}\rceil(\lceil\frac{n-1}{k}\rceil-1)}{2}k\\&=O(n^2) \end{aligned}$ 所以對於每個單一append操作的amortized cost為$O(n)$ ### Increase by a constant factor k $$\begin{aligned} \sum_{i=1}^nc_i& =\sum_{i=0}^{\lfloor\log_k\left(n{-}1\right)\rfloor }k^i+\sum_{i=1}^n1 \\ &=\frac{k^{\lfloor\log_k\left(n{-}1\right)\rfloor+1}-1}{k-1}+n \\ &=\frac{k\cdot k^{\lfloor\log_k\left(n{-}1\right)\rfloor}-1+nk-n}{k-1}\\ &\leq\frac{nk-1+nk-n}{k-1}\\ &\leq\frac{2nk-1}{k-1}\\ &=O(n) \end{aligned}$$ 所以對於每個單一append操作的amortized cost為$O(1)$ ### 結論 & 問題 從上面的推算可得出和Increase by a constant k相比,Increase by a constant factor k 時會比較好,但接著又衍生一個問題,那constant factor有沒有什麼比較好的選擇? 從時間複雜度來看,做n次append操作的複雜度皆為O(n),不論k多大對於每個單一append操作的amortized cost皆為O(1),而k只要不要差太多彼此的常數可能只會差一些些 ## Constant factor choice 從memory角度來看,我們會希望在經過N次resize後,可以重複利用前N-1次resize釋放的空間 **以2倍來說:** 1. 先假設現在dynamic array已經被分配16bytes(4個int)的memory 2. 接著要放入第5個int,則申請32bytes的memory,然後釋放16bytes memory,此時有16bytes的holes在memory 3. 接著要放入第9個int,則申請64bytes的memory,然後釋放32bytes memory,此時有48bytes的holes在memory 4. 接著要放入第17個int,則申請128bytes的memory,然後釋放64bytes memory,此時有112bytes的holes在memory **以1.5倍來說:** 1. 先假設現在dynamic array已經被分配16bytes(4個int)的memory 2. 接著要放入第5個int,則申請24bytes的memory,然後釋放16bytes memory,此時有16bytes的holes在memory 3. 接著要放入第7個int,則申請36bytes的memory,然後釋放24bytes memory,此時有40bytes的holes在memory 4. 接著要放入第10個int,則申請52bytes的memory,然後釋放36bytes memory,此時有76bytes的holes在memory 5. 接著要放入第14個int,則申請76bytes的memory,然後釋放52bytes memory,可以發現**76小於等於前面的hole大小76**,所以申請的空間可以來自前面的hole,即**可以重複利用先前釋放空間** 註:這邊申請的bytes是依據現在dynamic array的size決定,而非bytes,否則以int來說36+36/2會等於54,但54bytes的空間無法剛好全部放入int型態的資料 註:這邊一開始分配的空間可以想成是C++ vector的reserve(),然後放入4個int 稍微觀察一下可以發現當倍數>=2時,新申請的空間都會大於之前的hole總和,即無法利用先前釋放的空間 既然這樣那我們應該可以得出合理倍數應該是1<k<2,那為什麼有些情況是使用2的倍數? 這邊以linux系統下g++的C\++為例,使用2的倍數是因為Linux的Buddy system,而Buddy system要求memory的分配必須是2的冪次,所以在linux系統下會是兩倍 先撇除2倍,那接著,我們已經把範圍縮到1<k<2,而這範圍的時間複雜度也不會相差太多,那接下來有沒有辦法從1<k<2中找出最佳解? ### 最佳解 主要目標是讓可以reuse的空間越大越好 這邊先假設constant factor是$x$ 一開始分配的memory大小為$1$,接著新分配的大小為$x$,接著$x^2$,然後要分配$x^3$的空間時會希望他可以重複利用先前$1$和$x$釋放的空間 然後我們可以得到 $1 + x = x^3$ $x \approx 1.32$ 前面的先不重複利用,而接著分配$x^4$的新空間 $1 + x + x^2 = x^4$ $x \approx 1.47$ 歸納一下 $1+x+\cdots+x^{n-2}=x^n$ $\frac{x^{n-1}-1}{x-1} = x^n$ $x^{n-1}-1 = x^{n+1}-x^n$ 當n很大時,1可忽略 $x^{n-1} = x^{n+1}-x^n$ $x^2-x-1=0$ $x \approx 1.618$ 我們即可得出最佳倍數應該為$1.618$,而$1.618$同時也是黃金比例 那為什麼有些Compiler在處理dynamic array時不是以$1.618$這個倍數進行擴張? 首先$1.618$是經由一直先不重複利用前面釋放的memory為前提且N為無限大進行計算, 所以實際上正常來說不會使用$1.618$這個倍數,否則永遠都無法用到之前釋放的memory,就實際上來講倍數通常會建議稍微小於這個數字。 以Windows的MSVC為例,C++的dynamic array的擴張倍數會是$1.5$倍,計算$1.5$這個數字相對$1.618$可以較容易計算(例如經由bit shift操作計算),而$1.5$也和$1.618$較為接近 ![image](https://hackmd.io/_uploads/rJ_XRy2-R.png =300x290) 不同programming language和Compiler,constant factor的範圍都在$1< k\leq2$,從表格可以看得出來Python和Go比較獨樹一格(# 這邊提一下,C\++如果是Windows但是是使用MinGW編譯的話,constant factor會和linux的gcc或g++一樣是2 ## C++ ### Linux 作業系統:`Ubuntu 22.04 LTS` gcc版本:`11.4.0` g++版本:`11.4.0` C++版本:`11` 以下code如果是使用linux的話應該可以在`/usr/include/c++/<version>/`找到 題外話:GNU的coding style看起來好詭異(# `stl_vector.h` ```c= void push_back (const value_type &__x) { if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage) { _GLIBCXX_ASAN_ANNOTATE_GROW (1); _Alloc_traits::construct (this->_M_impl, this->_M_impl._M_finish, __x); ++this->_M_impl._M_finish; _GLIBCXX_ASAN_ANNOTATE_GREW (1); } else _M_realloc_insert (end (), __x); } ``` 可以發現滿的時候如果要`push_back`會調用`_M_realloc_insert (end (), __x)` `vector.tcc` `_M_realloc_insert` ```c= #if __cplusplus >= 201103L template<typename _Tp, typename _Alloc> template<typename... _Args> void vector<_Tp, _Alloc>:: _M_realloc_insert(iterator __position, _Args&&... __args) #else template<typename _Tp, typename _Alloc> void vector<_Tp, _Alloc>:: _M_realloc_insert(iterator __position, const _Tp& __x) #endif { const size_type __len = _M_check_len(size_type(1), "vector::_M_realloc_insert"); pointer __old_start = this->_M_impl._M_start; pointer __old_finish = this->_M_impl._M_finish; const size_type __elems_before = __position - begin(); pointer __new_start(this->_M_allocate(__len)); pointer __new_finish(__new_start); __try { // The order of the three operations is dictated by the C++11 // case, where the moves could alter a new element belonging // to the existing vector. This is an issue only for callers // taking the element by lvalue ref (see last bullet of C++11 // [res.on.arguments]). _Alloc_traits::construct(this->_M_impl, __new_start + __elems_before, #if __cplusplus >= 201103L std::forward<_Args>(__args)...); #else __x); #endif __new_finish = pointer(); #if __cplusplus >= 201103L if _GLIBCXX17_CONSTEXPR (_S_use_relocate()) { __new_finish = _S_relocate(__old_start, __position.base(), __new_start, _M_get_Tp_allocator()); ++__new_finish; __new_finish = _S_relocate(__position.base(), __old_finish, __new_finish, _M_get_Tp_allocator()); } else #endif { __new_finish = std::__uninitialized_move_if_noexcept_a (__old_start, __position.base(), __new_start, _M_get_Tp_allocator()); ++__new_finish; __new_finish = std::__uninitialized_move_if_noexcept_a (__position.base(), __old_finish, __new_finish, _M_get_Tp_allocator()); } } __catch(...) { if (!__new_finish) _Alloc_traits::destroy(this->_M_impl, __new_start + __elems_before); else std::_Destroy(__new_start, __new_finish, _M_get_Tp_allocator()); _M_deallocate(__new_start, __len); __throw_exception_again; } #if __cplusplus >= 201103L if _GLIBCXX17_CONSTEXPR (!_S_use_relocate()) #endif std::_Destroy(__old_start, __old_finish, _M_get_Tp_allocator()); _GLIBCXX_ASAN_ANNOTATE_REINIT; _M_deallocate(__old_start, this->_M_impl._M_end_of_storage - __old_start); this->_M_impl._M_start = __new_start; this->_M_impl._M_finish = __new_finish; this->_M_impl._M_end_of_storage = __new_start + __len; } ``` 簡單來說就是會申請一塊新的__len大小的memory,然後將原本資料複製新的memory中,再將原本資料的memory deallocate 可以發現跟size有關的是14行的 `const size_type __len = _M_check_len(size_type(1), "vector::_M_realloc_insert")` 所以繼續trace看`_M_check_len`如何實現 `vector.tcc` `_M_check_len` ```c= size_type _M_check_len (size_type __n, const char *__s) const { if (max_size () - size () < __n) __throw_length_error (__N (__s)); const size_type __len = size () + (std::max) (size (), __n); return (__len < size () || __len > max_size ()) ? max_size () : __len; } ``` 基本上都會回傳兩倍size()的數值,其他檢查則是避免vector size超過max_size(),max_size()即為vector能存放的最多element個數 ### MSVC 可以在Github看到MSVC對vector的實現:[Github網址](https://github.com/microsoft/STL/blob/a62109595b6d89e08172fdf4beb75a2670fe0cc9/stl/inc/vector) `STL/stl/inc/vector` [872行](https://github.com/microsoft/STL/blob/a62109595b6d89e08172fdf4beb75a2670fe0cc9/stl/inc/vector#L872C1-L879C6) ```c= _CONSTEXPR20 void push_back(const _Ty& _Val) { // insert element at end, provide strong guarantee _Emplace_one_at_back(_Val); } _CONSTEXPR20 void push_back(_Ty&& _Val) { // insert by moving into element at end, provide strong guarantee _Emplace_one_at_back(_STD move(_Val)); } ``` 可以發現如果要`push_back`會調用`_Emplace_one_at_back()` `STL/stl/inc/vector` [775行](https://github.com/microsoft/STL/blob/a62109595b6d89e08172fdf4beb75a2670fe0cc9/stl/inc/vector#L775C1-L786C6) ```c= template <class... _Valty> _CONSTEXPR20 _Ty& _Emplace_one_at_back(_Valty&&... _Val) { // insert by perfectly forwarding into element at end, provide strong guarantee auto& _My_data = _Mypair._Myval2; pointer& _Mylast = _My_data._Mylast; if (_Mylast != _My_data._Myend) { return _Emplace_back_with_unused_capacity(_STD forward<_Valty>(_Val)...); } return *_Emplace_reallocate(_Mylast, _STD forward<_Valty>(_Val)...); } ``` 可以發現滿的時候如果要`push_back`會調用`_Emplace_reallocate()` `STL/stl/inc/vector` [811行](https://github.com/microsoft/STL/blob/a62109595b6d89e08172fdf4beb75a2670fe0cc9/stl/inc/vector#L811C1-L858C6) ```c= template <class... _Valty> _CONSTEXPR20 pointer _Emplace_reallocate(const pointer _Whereptr, _Valty&&... _Val) { // reallocate and insert by perfectly forwarding _Val at _Whereptr _Alty& _Al = _Getal(); auto& _My_data = _Mypair._Myval2; pointer& _Myfirst = _My_data._Myfirst; pointer& _Mylast = _My_data._Mylast; _STL_INTERNAL_CHECK(_Mylast == _My_data._Myend); // check that we have no unused capacity const auto _Whereoff = static_cast<size_type>(_Whereptr - _Myfirst); const auto _Oldsize = static_cast<size_type>(_Mylast - _Myfirst); if (_Oldsize == max_size()) { _Xlength(); } const size_type _Newsize = _Oldsize + 1; const size_type _Newcapacity = _Calculate_growth(_Newsize); const pointer _Newvec = _Al.allocate(_Newcapacity); const pointer _Constructed_last = _Newvec + _Whereoff + 1; pointer _Constructed_first = _Constructed_last; _TRY_BEGIN _Alty_traits::construct(_Al, _Unfancy(_Newvec + _Whereoff), _STD forward<_Valty>(_Val)...); _Constructed_first = _Newvec + _Whereoff; if (_Whereptr == _Mylast) { // at back, provide strong guarantee if constexpr (is_nothrow_move_constructible_v<_Ty> || !is_copy_constructible_v<_Ty>) { _Uninitialized_move(_Myfirst, _Mylast, _Newvec, _Al); } else { _Uninitialized_copy(_Myfirst, _Mylast, _Newvec, _Al); } } else { // provide basic guarantee _Uninitialized_move(_Myfirst, _Whereptr, _Newvec, _Al); _Constructed_first = _Newvec; _Uninitialized_move(_Whereptr, _Mylast, _Newvec + _Whereoff + 1, _Al); } _CATCH_ALL _Destroy_range(_Constructed_first, _Constructed_last, _Al); _Al.deallocate(_Newvec, _Newcapacity); _RERAISE; _CATCH_END _Change_array(_Newvec, _Newsize, _Newcapacity); return _Newvec + _Whereoff; } ``` 可以發現跟size有關的是19行的`_Calculate_growth()` `STL/stl/inc/vector` [1961行](https://github.com/microsoft/STL/blob/a62109595b6d89e08172fdf4beb75a2670fe0cc9/stl/inc/vector#L1961C1-L1977C6) ```c= _CONSTEXPR20 size_type _Calculate_growth(const size_type _Newsize) const { // given _Oldcapacity and _Newsize, calculate geometric growth const size_type _Oldcapacity = capacity(); const auto _Max = max_size(); if (_Oldcapacity > _Max - _Oldcapacity / 2) { return _Max; // geometric growth would overflow } const size_type _Geometric = _Oldcapacity + _Oldcapacity / 2; if (_Geometric < _Newsize) { return _Newsize; // geometric growth would be insufficient } return _Geometric; // geometric growth is sufficient } ``` 基本上都會回傳capacity() + capacity()/2的數值,如果這個數值小於_Newsize(_Oldsize + 1),則回傳_Newsize(當_Oldsize為1時就有可能發生,畢竟1+1/2=1),其他檢查則是避免vector size超過max_size(),max_size()即為vector能存放的最多element個數 ### 結論 可以根據以上trace過程,得出使用gcc或g++時,vector要push_back時若超出現有capacity,則會將capacity\*2,即2倍。而使用MSVC時vector要push_back時若超出現有capacity,則會將capacity變為capacity+capacity/2,即1.5倍 ## Reference [Dynamic array](https://en.wikipedia.org/wiki/Dynamic_array) [Optimal memory reallocation and the golden ratio](https://archive.li/Z2R8w#selection-83.0-83.48) [What is the ideal growth rate for a dynamically allocated array?](https://stackoverflow.com/questions/1100311/what-is-the-ideal-growth-rate-for-a-dynamically-allocated-array) [Reasoning behind specifically doubling the array size upon reallocation](https://cs.stackexchange.com/questions/149713/reasoning-behind-specifically-doubling-the-array-size-upon-reallocation) [Amortized Analysis of Dynamic Arrays](https://www.cs.utexas.edu/~slaberge/docs/topics/amortized/dynamic_arrays/)