Numpy: Lecture-4 (2 hours)

---
title: Views vs Copies (Shallow vs Deep Copy)
description:
duration: 600
card_type: cue_card
---

# Views vs Copies (Shallow vs Deep Copy)

- Numpy **manages memory very efficiently**,
- which makes it really **useful while dealing with large datasets**.

#### But how does it manage memory so efficiently?

- Let's create some arrays to understand what's happening in memory while using numpy.

Code
```python=
import numpy as np
```

Code
```python=
# We'll create a np array
a = np.arange(4)
a
```

    array([0, 1, 2, 3])

Code
```python=
# Reshape array `a` and store in `b`
b = a.reshape(2, 2)
b
```

    array([[0, 1],
           [2, 3]])

#### Now we will make some changes to our original array `a`.

Code
```python=
a[0] = 100
a
```

    array([100,   1,   2,   3])

#### What will be the values if we print array `b`?

Code
```python=
b
```

    array([[100,   1],
           [  2,   3]])

Array **`b` got automatically updated**.

This is an example of numpy using a "Shallow Copy" of data.

#### Now, what happens here?

- Numpy **re-uses data** as much as possible **instead of duplicating** it.
- This helps numpy to be efficient.

#### When we created `b=a.reshape(2,2)`

- Numpy **did NOT make a copy of `a` to store in `b`**, as we can clearly see.
- It is **using the same data as in `a`**.
- It **just looks different (reshaped)** in `b`.
- That is why **any change in `a` is automatically reflected in `b`**.

### Now, let's see an example where Numpy will create a "Deep Copy" of data.

Code
```python=
a = np.arange(4)
a
```

    array([0, 1, 2, 3])

Code
```python=
# Create `c`
c = a + 2
c
```

    array([2, 3, 4, 5])

Code
```python=
# We make changes in `a`
a[0] = 100
a
```

    array([100,   1,   2,   3])

Code
```python=
c
```

    array([2, 3, 4, 5])

Code
```python=
np.shares_memory(a, c) # Deep Copy
```

    False

#### As we can see, `c` was not affected by changing `a`.

- Because `a + 2` is an arithmetic operation that **computes new values**.
- The result is **new data**, not just a different way of looking at the existing data.
- So, Numpy **had to allocate a separate array for `c`**,
- i.e., `c` holds its own data (a **deep copy**), independent of `a`.
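To make the contrast concrete, here is a minimal sketch (not part of the original lecture) that puts a reshape and an arithmetic result side by side. It also uses the `.base` attribute and an in-place `+=`, which are not covered above and are included here only as an illustration.

```python
import numpy as np

a = np.arange(4)
b = a.reshape(2, 2)   # reshape: reuses the buffer of `a`
c = a + 2             # arithmetic: allocates a brand-new array

print(np.shares_memory(a, b))  # True
print(np.shares_memory(a, c))  # False

# `.base` points to the array that owns the data (None if the array owns it)
print(b.base is a)     # True
print(c.base is None)  # True

# An in-place update writes into the existing buffer,
# so the reshaped array sees it while `c` does not.
a += 10
print(b)  # [[10 11]
          #  [12 13]]
print(c)  # [2 3 4 5]
```

Note that `a += 10` modifies `a` in place, whereas `a = a + 10` would create a new array and rebind the name `a`, leaving `b` untouched.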
### Conclusion:

- Numpy is able to **use the same data** for **simpler operations** like **reshape** ---> **Shallow Copy**.
- It creates a **copy of data** where operations **produce new values** (like arithmetic) ---> **Deep Copy**.

---
title: Checking memory sharing using `np.shares_memory()`
description:
duration: 300
card_type: cue_card
---

#### Is there a way to check whether two arrays are sharing memory or not?

Yes, there is. `np.shares_memory()` function to the rescue!!!

Code
```python=
a = np.arange(10)
a
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Code
```python=
b = a[::2]
b
```

    array([0, 2, 4, 6, 8])

Code
```python=
np.shares_memory(a,b)
```

    True

> Notice that slicing creates shallow copies.

Code
```python=
a[0] = 1000
```

Code
```python=
b
```

    array([1000,    2,    4,    6,    8])

Code
```python=
a = np.arange(6)
a
```

    array([0, 1, 2, 3, 4, 5])

Code
```python=
# Boolean mask (True for every element here)
b = a[a % 1 == 0]
b
```

    array([0, 1, 2, 3, 4, 5])

Code
```python=
b[0] = 10
```

Code
```python=
a[0]
```

    0

Code
```python=
np.shares_memory(a,b)
```

    False

```
Memory in Numpy ->
- Shallow Copy - Reshaping, Slicing...
- Deep Copy - Arithmetic Operations, Masking...
```

Code
```python=
a = np.arange(10)
```

Code
```python=
a_shallow_copy = a.view() # Creates a shallow copy of a
```

Code
```python=
np.shares_memory(a_shallow_copy, a)
```

    True

Code
```python=
a_deep_copy = a.copy() # Creates a deep copy of a
```

Code
```python=
np.shares_memory(a_deep_copy, a)
```

    False

---
title: Quiz-1
description: Quiz-1
duration: 60
card_type: quiz_card
---

# Question

```python=
a = np.array([0,1,2,3,4,5])
b = a[a%1 == 0]
b[0] = 10
a[:2]  # = ?
```

# Choices

- [x] [0,1]
- [ ] [0,1,2]
- [ ] [10,1]
- [ ] [10,1,2]

---
title: Understanding `.view()`
description:
duration: 300
card_type: cue_card
---

#### `.view()`

Returns a view of the original array.

- Any changes made in the new array will be reflected in the original array.

Documentation: <https://numpy.org/doc/stable/reference/generated/numpy.ndarray.view.html>

Code
```python=
arr = np.arange(10)
arr
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Code
```python=
view_arr = arr.view()
view_arr
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Let's modify the content of `view_arr` and check whether it modified the original array as well.

Code
```python=
view_arr[4] = 420
view_arr
```

    array([  0,   1,   2,   3, 420,   5,   6,   7,   8,   9])

Code
```python=
arr
```

    array([  0,   1,   2,   3, 420,   5,   6,   7,   8,   9])

Code
```python=
np.shares_memory(arr, view_arr)
```

    True

Notice that changes in the view array are reflected in the original array.

---
title: Making Deep Copy using `.copy()`
description:
duration: 300
card_type: cue_card
---

#### How do we make a deep copy?

Numpy has the `.copy()` function for that purpose.

#### `.copy()`

Returns a copy of the array.

Documentation (`.copy()`): <https://numpy.org/doc/stable/reference/generated/numpy.ndarray.copy.html#numpy.ndarray.copy>

Documentation (`np.copy()`): <https://numpy.org/doc/stable/reference/generated/numpy.copy.html>

Code
```python=
arr = np.arange(10)
arr
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Code
```python=
copy_arr = arr.copy()
copy_arr
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Let's modify the content of `copy_arr` and check whether it modified the original array as well.

Code
```python=
copy_arr[3] = 45
copy_arr
```

    array([ 0,  1,  2, 45,  4,  5,  6,  7,  8,  9])

Code
```python=
arr
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Notice that the contents of the original array were not modified when we changed our copy array.

### Summarizing

- `.view()` returns a shallow copy of an array.
- `.copy()` returns a deep copy of an array, except for object-type arrays (there, the Python objects inside are still shared).
- `copy.deepcopy()` returns a deep copy of an array, including object-type arrays.

---
title: Array Splitting
description:
duration: 300
card_type: cue_card
---

# Splitting

#### `np.split()`

- Splits an array into multiple sub-arrays as views.

#### It takes an argument `indices_or_sections`.

- If `indices_or_sections` is an **integer, n**, the array will be **divided into n equal arrays along axis**.
  - If such a split is not possible, an error is raised.
- If `indices_or_sections` is a **1-D array of sorted integers**, the entries indicate **where along axis the array is split**.
  - If an index **exceeds the dimension of the array along axis**, an **empty sub-array is returned** correspondingly.

Code
```python=
x = np.arange(9)
x
```

    array([0, 1, 2, 3, 4, 5, 6, 7, 8])

Code
```python=
np.split(x, 3)
```

    [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

> **IMPORTANT REQUISITE** -> Number of elements in the array should be divisible by the number of sections

Code
```python=
b = np.arange(10)
np.split(b, 3)
```

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-30-5033f171e13f> in <cell line: 2>()
          1 b = np.arange(10)
    ----> 2 np.split(b, 3)

    /usr/local/lib/python3.10/dist-packages/numpy/core/overrides.py in split(*args, **kwargs)

    /usr/local/lib/python3.10/dist-packages/numpy/lib/shape_base.py in split(ary, indices_or_sections, axis)
        870         N = ary.shape[axis]
        871         if N % sections:
    --> 872             raise ValueError(
        873                 'array split does not result in an equal division') from None
        874     return array_split(ary, indices_or_sections, axis)

    ValueError: array split does not result in an equal division

Code
```python=
# Drop the last element so that 9 elements remain (divisible by 3)
b[0:-1]
np.split(b[0:-1], 3)
```

    [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

Code
```python=
# Splitting on the basis of exact indices
c = np.arange(16)
```

Code
```python=
np.split(c, [3, 5, 6])
```

    [array([0, 1, 2]),
     array([3, 4]),
     array([5]),
     array([ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15])]

---
title: Understanding horizontal and vertical split
description:
duration: 300
card_type: cue_card
---

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/054/735/original/hvsp1.png?1698041133" width="500" height="350">

#### `np.hsplit()`

- Splits an array into multiple sub-arrays **horizontally (column-wise)**.

Code
```python=
x = np.arange(16.0).reshape(4, 4)
x
```

    array([[ 0.,  1.,  2.,  3.],
           [ 4.,  5.,  6.,  7.],
           [ 8.,  9., 10., 11.],
           [12., 13., 14., 15.]])

#### Think of it this way:

- There are 2 axes to a 2D array
  1. **1st axis - Vertical axis**
  2. **2nd axis - Horizontal axis**

#### Along which axis are we splitting the array?

- The split we want happens across the **2nd axis (Horizontal axis)**
- That is why we use `hsplit()`

#### So, try to think in terms of "whether the operation is happening along vertical axis or horizontal axis"

- We are splitting the horizontal axis in this case.

Code
```python=
np.hsplit(x, 2)
```

    [array([[ 0.,  1.],
            [ 4.,  5.],
            [ 8.,  9.],
            [12., 13.]]),
     array([[ 2.,  3.],
            [ 6.,  7.],
            [10., 11.],
            [14., 15.]])]

Code
```python=
np.hsplit(x, np.array([3, 6]))
```

    [array([[ 0.,  1.,  2.],
            [ 4.,  5.,  6.],
            [ 8.,  9., 10.],
            [12., 13., 14.]]),
     array([[ 3.],
            [ 7.],
            [11.],
            [15.]]),
     array([], shape=(4, 0), dtype=float64)]

#### `np.vsplit()`

- Splits an array into multiple sub-arrays **vertically (row-wise)**.

Code
```python=
x = np.arange(16.0).reshape(4, 4)
x
```

    array([[ 0.,  1.,  2.,  3.],
           [ 4.,  5.,  6.,  7.],
           [ 8.,  9., 10., 11.],
           [12., 13., 14., 15.]])

#### Now, along which axis are we splitting the array?

- The split we want happens across the **1st axis (Vertical axis)**
- That is why we use `vsplit()`

#### Again, always try to think in terms of "whether the operation is happening along vertical axis or horizontal axis"

- We are splitting the vertical axis in this case.
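
The "which axis are we splitting" idea can be checked directly. The following is a small sketch (an addition, not from the original lecture) that compares `np.hsplit()` and `np.vsplit()` on a 2-D array against `np.split()` with an explicit `axis` argument. The helper names `h_match` and `v_match` are made up for this illustration.

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)

# Horizontal split: cut along the 2nd axis (axis=1)
h_parts = np.hsplit(x, 2)
h_same = np.split(x, 2, axis=1)

# Vertical split: cut along the 1st axis (axis=0)
v_parts = np.vsplit(x, 2)
v_same = np.split(x, 2, axis=0)

# Compare the pieces pair by pair
h_match = all(np.array_equal(p, q) for p, q in zip(h_parts, h_same))
v_match = all(np.array_equal(p, q) for p, q in zip(v_parts, v_same))
print(h_match, v_match)                    # True True

# hsplit keeps all rows and halves the columns; vsplit does the opposite
print(h_parts[0].shape, v_parts[0].shape)  # (4, 2) (2, 4)
```
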

Code
```python=
np.vsplit(x, 2)
```

    [array([[0., 1., 2., 3.],
            [4., 5., 6., 7.]]),
     array([[ 8.,  9., 10., 11.],
            [12., 13., 14., 15.]])]

Code
```python=
np.vsplit(x, np.array([3]))
```

    [array([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]]),
     array([[12., 13., 14., 15.]])]

---
title: Stacking (`vstack`)
description:
duration: 200
card_type: cue_card
---

# Stacking (`vstack`)

Code
```python=
a = np.arange(1, 5)
b = np.arange(2, 6)
c = np.arange(3, 7)
```

#### `np.vstack()`

- Stacks a list of arrays **vertically (along axis 0 or 1st axis)**.
- For **example**, **given a list of row vectors, appends the rows to form a matrix**.

Code
```python=
np.vstack([b, c, a])
```

    array([[2, 3, 4, 5],
           [3, 4, 5, 6],
           [1, 2, 3, 4]])

Code
```python=
a = np.arange(1, 5)
b = np.arange(2, 4)
c = np.arange(3, 10)
```

Code
```python=
np.vstack([b, c, a])
```

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-40-5148cb6ebc5f> in <cell line: 1>()
    ----> 1 np.vstack([b, c, a])

    /usr/local/lib/python3.10/dist-packages/numpy/core/overrides.py in vstack(*args, **kwargs)

    /usr/local/lib/python3.10/dist-packages/numpy/core/shape_base.py in vstack(tup)
        280     if not isinstance(arrs, list):
        281         arrs = [arrs]
    --> 282     return _nx.concatenate(arrs, 0)
        283
        284

    /usr/local/lib/python3.10/dist-packages/numpy/core/overrides.py in concatenate(*args, **kwargs)

    ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2 and the array at index 1 has size 7

> **Note**: `vstack` needs every row to have the same length; here the arrays have 2, 7 and 4 elements, so they cannot be stacked.

---
title: Quiz-2
description: Quiz-2
duration: 60
card_type: quiz_card
---

# Question

What will be the output of the following code?

```python=
a = np.array([[1], [2], [3]])
b = np.array([[4], [5], [6]])
np.vstack((a, b))
```

# Choices

- [ ] `array([1, 2, 3, 4, 5, 6])`
- [ ] `array([[1, 4], [2, 5], [3, 6]])`
- [x] `array([[1], [2], [3], [4], [5], [6]])`
- [ ] Error

---
title: Stacking (`hstack`)
description:
duration: 200
card_type: cue_card
---

### Explanation of Quiz-2:

Code
```python=
a = np.array([[1], [2], [3]])
b = np.array([[4], [5], [6]])
np.vstack((a, b))
```

    array([[1],
           [2],
           [3],
           [4],
           [5],
           [6]])

# Stacking (`hstack`)

Code
```python=
a = np.arange(5).reshape(5, 1)
```

Code
```python=
b = np.arange(15).reshape(5, 3)
```

Code
```python=
np.hstack([a, b])
```

    array([[ 0,  0,  1,  2],
           [ 1,  3,  4,  5],
           [ 2,  6,  7,  8],
           [ 3,  9, 10, 11],
           [ 4, 12, 13, 14]])

---
title: Quiz-3
description: Quiz-3
duration: 60
card_type: quiz_card
---

# Question

What will be the output of this?

    a = np.array([[1], [2], [3]])
    b = np.array([[4], [5], [6]])
    np.hstack((a, b))

# Choices

- [ ] `[[1] [2] [3] [4] [5] [6]]`
- [ ] `[[1 2] [3 4] [5 6]]`
- [x] `[[1 4] [2 5] [3 6]]`
- [ ] `[[4 1] [5 2] [6 3]]`

---
title: Understanding `np.concatenate()`
description:
duration: 300
card_type: cue_card
---

### Explanation of Quiz-3:

Code
```python=
a = np.array([[1], [2], [3]])
a
```

    array([[1],
           [2],
           [3]])

Code
```python=
b = np.array([[4], [5], [6]])
b
```

    array([[4],
           [5],
           [6]])

Code
```python=
np.hstack((a, b))
```

    array([[1, 4],
           [2, 5],
           [3, 6]])

#### This time both `a` and `b` are column vectors.

- So, the stacking of `a` and `b` along the horizontal axis is more clearly visible.

## `np.concatenate()`

- Can perform both `vstack` and `hstack`.
- Creates a new array by appending arrays after each other, along a given axis.
- Provides similar functionality, but it takes a **keyword argument `axis`** that specifies the **axis along which the arrays are to be concatenated**.

#### Input arrays to `concatenate()` must all have the same number of dimensions.

Code
```python=
a = np.array([1,2,3])
b = np.array([[1,2,3], [4,5,6]])
```

Code
```python=
a
```

    array([1, 2, 3])

Code
```python=
b
```

    array([[1, 2, 3],
           [4, 5, 6]])

Code
```python=
np.concatenate([a, b], axis = 0)
```

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-47-1a93c4fe21df> in <cell line: 1>()
    ----> 1 np.concatenate([a, b], axis = 0)

    /usr/local/lib/python3.10/dist-packages/numpy/core/overrides.py in concatenate(*args, **kwargs)

    ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)

> **Note**: concatenate can only work if both `a` and `b` have the same number of dimensions

Code
```python=
a = np.array([[1,2,3]])
b = np.array([[1,2,3], [4,5,6]])
```

Code
```python=
np.concatenate([a, b], axis = 0) # axis = 0 -> vstack
```

    array([[1, 2, 3],
           [1, 2, 3],
           [4, 5, 6]])

Code
```python=
a = np.arange(6).reshape(3, 2)
b = np.arange(9).reshape(3, 3)
```

Code
```python=
np.concatenate([a, b], axis = 1) # axis = 1 -> hstack
```

    array([[0, 1, 0, 1, 2],
           [2, 3, 3, 4, 5],
           [4, 5, 6, 7, 8]])

Code
```python=
a = np.array([[1,2], [3,4]])
b = np.array([[5,6,7,8]])
```

Code
```python=
np.concatenate([a, b], axis = None) # axis = None joins and converts to 1D
```

    array([1, 2, 3, 4, 5, 6, 7, 8])

#### Let's look at a few more examples using `np.concatenate()`.

#### Question: What will be the output of this?

    a = np.array([[1, 2], [3, 4]])
    b = np.array([[5, 6]])
    np.concatenate((a, b), axis=0)

Code
```python=
a = np.array([[1, 2], [3, 4]])
a
```

    array([[1, 2],
           [3, 4]])

Code
```python=
b = np.array([[5, 6]])
b
```

    array([[5, 6]])

Code
```python=
np.concatenate((a, b), axis=0)
```

    array([[1, 2],
           [3, 4],
           [5, 6]])

#### How did it work?

- The dimensions of `a` are 2 × 2

#### What are the dimensions of `b`?

- 1-D array ?? - **NO**
- Look carefully!!
- **`b` is a 2-D array of dimensions 1 × 2**

#### `axis = 0` ---> It's the vertical axis

- So, **changes will happen along the vertical axis**.
- So, **`b` gets concatenated below `a`**.

#### What if we pass `axis=None` instead of an axis along which to concatenate?

Code
```python=
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=None)
```

    array([1, 2, 3, 4, 5, 6])

#### Can you see what happened here?

- When we pass **`axis=None`** (note that leaving `axis` out entirely means `axis=0`, not `None`),
- `np.concatenate()` **flattens the arrays and concatenates them into a 1-D array.**

---
title: Quiz-4
description: Quiz-4
duration: 60
card_type: quiz_card
---

# Question

What will be the result of this concatenation operation?

```python=
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b.T), axis=1)
```

# Choices

- [ ] `[[1, 2], [3, 4], [5, 6]]`
- [x] `[[1, 2, 5], [3, 4, 6]]`
- [ ] Error

---
title: Quiz-4 Explanation
description:
duration: 60
card_type: cue_card
---

## Explanation:

Code
```python=
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b.T), axis=1)
```

    array([[1, 2, 5],
           [3, 4, 6]])

#### What happened here?

- **Dimensions of `a`** are again 2 × 2
- **Dimensions of `b`** are again 1 × 2
- So, **Dimensions of `b.T`** will be 2 × 1

#### This time, **`axis = 1`** ---> Changes will happen along the horizontal axis

- So, **`b.T` gets concatenated horizontally to `a`**
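
As a recap, the sketch below (an addition, not from the original lecture) compares `np.concatenate()` with `np.vstack()` and `np.hstack()` on the same 2-D arrays, and shows the three `axis` cases next to each other. The variable names are made up for this illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# Leaving `axis` out uses the default axis=0, same as vstack here
rows_default = np.concatenate((a, b))
rows_vstack = np.vstack((a, b))
print(np.array_equal(rows_default, rows_vstack))  # True

# axis=1 on 2-D inputs matches hstack; b.T has shape (2, 1)
cols_concat = np.concatenate((a, b.T), axis=1)
cols_hstack = np.hstack((a, b.T))
print(np.array_equal(cols_concat, cols_hstack))   # True

# axis=None flattens the inputs first and returns a 1-D array
flat = np.concatenate((a, b), axis=None)
print(flat)        # [1 2 3 4 5 6]
print(flat.shape)  # (6,)
```
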