# NumPy Debugging Reference
## NumPy Functions
### `np.bincount(x, weights=None, minlength=0)`
Counts how often each value occurs: `out[i]` is the number of times `i` appears in `x`. **`x` must be 1D and contain only non-negative integers.**
```python
>>> np.bincount([0, 1, 1, 3, 2, 1])
array([1, 3, 1, 1]) # counts: [0 appears 1x, 1 appears 3x, 2 appears 1x, 3 appears 1x]
>>> np.bincount([0, 1, 1], minlength=5)
array([1, 2, 0, 0, 0]) # force output length
>>> np.bincount([0, 1, 2], weights=[0.5, 1.0, 0.5])
array([0.5, 1. , 0.5]) # sum of weights per bin, not a count
```
**Common bugs:**
- Forgetting `minlength` → output has length `max(x) + 1`, which is too short when the largest expected values never occur
- Passing negative values → ValueError
- Passing floats → TypeError
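
A minimal sketch of the `minlength` and float pitfalls. The label data and the name `n_classes` are made up for illustration; the explicit cast is only safe once you know the floats hold whole numbers.

```python
import numpy as np

n_classes = 5                       # hypothetical: number of classes we expect
labels = np.array([0, 1, 1, 3])     # classes 2 and 4 never occur

short = np.bincount(labels)                       # length labels.max() + 1 == 4
full = np.bincount(labels, minlength=n_classes)   # padded to length 5

# Float input is rejected, even when every value is a whole number.
float_labels = np.array([0.0, 1.0, 1.0, 3.0])
try:
    np.bincount(float_labels)
    float_error = None
except TypeError as exc:
    float_error = exc

# Cast explicitly once you have checked that the values are integral.
from_floats = np.bincount(float_labels.astype(np.intp), minlength=n_classes)

print(short, full, from_floats)
```

Indexing `short[4]` would raise `IndexError`, which is usually how the missing `minlength` shows up.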
---
### `np.nonzero(a)`
Returns **tuple of arrays**, one per dimension, containing indices of non-zero elements.
```python
>>> arr = np.array([0, 2, 0, 3])
>>> np.nonzero(arr)
(array([1, 3]),) # tuple! indices where arr != 0
>>> np.nonzero(arr)[0] # unwrap for 1D
array([1, 3])
>>> mat = np.array([[0, 1], [2, 0]])
>>> np.nonzero(mat)
(array([0, 1]), array([1, 0])) # (row_indices, col_indices)
```
**Common bugs:**
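- Treating the result as an index array for 1D input: it is a 1-tuple, so unwrap with `[0]` before calling `len()`, iterating, or doing arithmetic on the indices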
- Expecting flat indices instead of tuple of coordinates
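
If flat indices or coordinate pairs are what you actually want, `np.flatnonzero` and `np.argwhere` may be a better fit than unpacking the tuple by hand. A short sketch on the same `mat` as above:

```python
import numpy as np

mat = np.array([[0, 1], [2, 0]])

rows, cols = np.nonzero(mat)    # one index array per dimension
values = mat[rows, cols]        # the tuple parts index the array directly
flat = np.flatnonzero(mat)      # indices into mat.ravel()
pairs = np.argwhere(mat)        # one (row, col) pair per non-zero element

print(values, flat, pairs.tolist())
```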
---
### `np.where(condition, [x, y])`
Two forms:
1. `np.where(condition)` → same as `np.nonzero(condition)` (returns tuple of indices)
2. `np.where(condition, x, y)` → element-wise: x where True, y where False
```python
>>> arr = np.array([1, -2, 3, -4])
# Form 1: indices only (returns tuple!)
>>> np.where(arr > 0)
(array([0, 2]),)
# Form 2: conditional replacement
>>> np.where(arr > 0, arr, 0)
array([1, 0, 3, 0])
>>> np.where(arr > 0, 'pos', 'neg')
array(['pos', 'neg', 'pos', 'neg'], dtype='<U3')
```
**Common bugs:**
- Using 1-arg form expecting array, getting tuple
- Confusing with `arr[condition]` which returns values, not indices
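
Both bugs side by side, as a small sketch. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([1, -2, 3, -4])
mask = arr > 0

idx_tuple = np.where(mask)    # 1-tuple holding one index array
idx = idx_tuple[0]            # positions of the True entries
vals = arr[mask]              # values at those positions

n_wrong = len(idx_tuple)      # number of dimensions, not number of matches
n_right = len(idx)            # number of matches

print(idx, vals, n_wrong, n_right)
```

`len(np.where(mask))` silently returning the number of dimensions is an easy way to get a count of 1 where you expected the number of matches.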
---
### `np.sum(a, axis=None, keepdims=False)`
```python
>>> arr = np.array([[1, 2], [3, 4]])
>>> np.sum(arr) # all elements
10
>>> np.sum(arr, axis=0) # sum columns (collapse rows)
array([4, 6])
>>> np.sum(arr, axis=1) # sum rows (collapse columns)
array([3, 7])
>>> np.sum(arr, axis=1, keepdims=True) # preserve dims for broadcasting
array([[3],
       [7]])
```
**Common bugs:**
- Wrong axis: `axis=0` collapses rows (one result per column), `axis=1` collapses columns (one result per row)
- Missing `keepdims=True` when result needs to broadcast back
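
A sketch of the `keepdims` bug using row normalization as the example task (the task itself is an assumption, chosen because it needs the sums to broadcast back):

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   # shape (2, 3)

row_sums = np.sum(arr, axis=1)                      # shape (2,)
row_sums_kd = np.sum(arr, axis=1, keepdims=True)    # shape (2, 1)

try:
    arr / row_sums            # (2, 3) / (2,): trailing dimensions 3 and 2 differ
    broadcast_error = None
except ValueError as exc:
    broadcast_error = exc

normalized = arr / row_sums_kd    # (2, 3) / (2, 1): each row now sums to 1
print(normalized.sum(axis=1))
```

With a square array the version without `keepdims` does not raise: `(n, n) / (n,)` broadcasts along the last axis and divides by the wrong sums without any error.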
---
### `np.min(a, axis=None)` / `np.max(a, axis=None)`
Same axis semantics as `np.sum`; both also accept `keepdims`.
```python
>>> arr = np.array([[1, 5], [3, 2]])
>>> np.min(arr, axis=0) # min of each column
array([1, 2])
>>> np.max(arr, axis=1) # max of each row
array([5, 3])
```
---
### `np.random.RandomState(seed)`
Reproducible random number generator.
```python
>>> rng = np.random.RandomState(42)
>>> rng.rand(3) # uniform [0, 1), shape (3,)
array([0.37454012, 0.95071431, 0.73199394])
>>> rng = np.random.RandomState(42) # re-seed: the output below is for a fresh generator
>>> rng.randint(0, 10, 5) # integers in [0, 10), shape (5,)
array([6, 3, 7, 4, 6])
>>> rng.randn(2, 3) # standard normal, shape (2, 3)
>>> rng.choice([1,2,3], size=2, replace=False) # sample without replacement
>>> rng.shuffle(arr) # in-place shuffle (returns None!)
>>> rng.permutation(arr) # returns shuffled copy
```
**Common bugs:**
- Assigning the return value of `rng.shuffle()` (it is `None`; the shuffle happens in place)
- Expecting `randint(a, b)` to include `b`: it samples from [a, b)
- Confusing `rand` (uniform) vs `randn` (normal)
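
A sketch of the first two bugs. The seed and the die-roll framing are arbitrary choices for illustration, and no exact outputs are shown because they depend on the seed.

```python
import numpy as np

rng = np.random.RandomState(0)
data = np.arange(5)

result = rng.shuffle(data)                       # shuffles data in place, returns None
shuffled_copy = rng.permutation(np.arange(5))    # returns a new shuffled array

# Die rolls: the upper bound 7 is excluded, so values fall in 1..6.
rolls = rng.randint(1, 7, size=1000)

print(result, data, shuffled_copy, rolls.min(), rolls.max())
```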
---
### Broadcasting Rules
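Shapes are aligned on their trailing dimensions and compared right to left. Two dimensions are compatible if they are equal or one of them is 1; missing leading dimensions are treated as 1.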
```python
# Shape arithmetic (notation only, not executable):
# (4, 3) + (3,)   → (4, 3) ✓   (3,) is treated as (1, 3), then stretched to (4, 3)
# (4, 3) + (4,)   → error  ✗   trailing dimensions 3 and 4 differ
# (4, 3) + (4, 1) → (4, 3) ✓   the 1 stretches to 3
# Common pattern: make (n,) broadcastable to (n, d)
weights = np.array([1, 2, 3]) # shape (3,)
weights[:, None] # shape (3, 1) - broadcasts with (3, d)
weights.reshape(-1, 1) # equivalent
```
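
The right-to-left rule can be sketched in plain Python. `broadcast_shape` is a hypothetical helper written for this note, not a NumPy function; recent NumPy versions (1.20+) ship `np.broadcast_shapes`, which should give the same answers.

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the last dimension; a missing dimension counts as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))

ok_trailing = broadcast_shape((4, 3), (3,))
ok_column = broadcast_shape((4, 3), (4, 1))
try:
    broadcast_shape((4, 3), (4,))
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(ok_trailing, ok_column, mismatch_error)
```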
---
### Floating Point Comparisons
```python
# BAD
if np.sum(vec**2) == 1.0: ...
# GOOD
if np.isclose(np.sum(vec**2), 1.0): ...
# For arrays
np.allclose(arr1, arr2, rtol=1e-5, atol=1e-8)
```
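
A short sketch of why `==` fails and one thing to watch with the defaults: `np.isclose` uses `atol=1e-8`, so anything that small compares as close to zero. The sample values are arbitrary.

```python
import numpy as np

total = 0.1 + 0.2
exact = (total == 0.3)             # False: binary rounding error
close = np.isclose(total, 0.3)     # True with default rtol=1e-5, atol=1e-8

vec = np.array([0.6, 0.8])
is_unit = np.isclose(np.sum(vec**2), 1.0)

# Near zero the absolute tolerance dominates.
tiny_default = np.isclose(1e-10, 0.0)             # True: 1e-10 <= atol
tiny_strict = np.isclose(1e-10, 0.0, atol=0.0)    # False once atol is removed

print(exact, close, is_unit, tiny_default, tiny_strict)
```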
---
## Python Patterns
### NamedTuple
```python
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)
p.x # 1
p[0] # 1
# p.x = 5 # would raise AttributeError: namedtuples are immutable
# To "modify":
p2 = p._replace(x=5) # Point(x=5, y=2)
p3 = Point(5, p.y) # equivalent
```
### List Comprehension Order
```python
# Nested comprehension: the right-hand `for` is the outer loop; the inner list is rebuilt for each i
[[i*j for j in range(3)] for i in range(2)] # [[0,0,0], [0,1,2]]
# Flattening: multiple `for` clauses read left to right, same order as nested for loops
[x for row in matrix for x in row] # row is outer loop
```
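
The flattening form written out as explicit loops, as a quick check of the ordering. `matrix` here is sample data.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

flat = [x for row in matrix for x in row]

flat_loop = []
for row in matrix:        # first `for` clause is the outer loop
    for x in row:         # second `for` clause is the inner loop
        flat_loop.append(x)

table = [[i * j for j in range(3)] for i in range(2)]

print(flat, flat_loop, table)
```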
### Mutable Default Arguments
```python
# BAD
def __init__(self, items=[]):
    self.items = items # shared across all instances!
# GOOD
def __init__(self, items=None):
    self.items = items if items is not None else []
```
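
A runnable version of the same pattern. The class names `SharedBag` and `SafeBag` are made up for the demonstration.

```python
class SharedBag:
    def __init__(self, items=[]):      # BAD: the list is created once, at def time
        self.items = items

class SafeBag:
    def __init__(self, items=None):    # GOOD: a fresh list per instance
        self.items = items if items is not None else []

a, b = SharedBag(), SharedBag()
a.items.append("x")                    # also visible through b.items

c, d = SafeBag(), SafeBag()
c.items.append("x")                    # d.items stays empty

print(b.items, d.items)
```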
---
## pdb Debugger
### Starting the Debugger
```python
# Insert at point where you want to break
import pdb; pdb.set_trace()
# Python 3.7+ shorthand
breakpoint()
```
### Essential Commands
| Command | Short | Description |
|---------|-------|-------------|
| `next` | `n` | Execute next line (step over functions) |
| `step` | `s` | Step into function call |
| `continue` | `c` | Continue execution until next breakpoint |
| `return` | `r` | Continue until current function returns |
| `list` | `l` | Show source code around current line |
| `list .` | | Re-center listing on current line |
| `p expr` | | Print expression value (`print(expr)` also works, as a plain Python call) |
| `pp expr` | | Pretty-print expression |
| `where` | `w` | Show call stack |
| `up` | `u` | Move up one frame in stack |
| `down` | `d` | Move down one frame in stack |
| `quit` | `q` | Quit debugger (and program) |
### Practical Example
```python
import numpy as np

def buggy_normalize(arr):
    import pdb; pdb.set_trace()
    total = np.sum(arr) # after hitting n: check `p total`, `p arr.shape`
    return arr / total
# In pdb:
# (Pdb) p arr.shape
# (3, 4)
# (Pdb) p total
# 24.0
# (Pdb) p np.sum(arr, axis=1, keepdims=True)
# array([[6.], [6.], [12.]]) # aha, need axis parameter!
```
### Tips
- Type any Python expression to evaluate it
- Prefix a statement with `!` when it would otherwise be parsed as a pdb command: `!n = 5`
- `interact` drops into full Python shell at current frame
- Set conditional breakpoints in code: `if condition: pdb.set_trace()`
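
One way to keep a conditional breakpoint in the code without stopping on every call. The `debug` flag and the NaN condition are assumptions for the example; pick whatever condition marks the suspicious case.

```python
import math

def running_mean(values, debug=False):
    """Average a sequence; optionally break on the first NaN."""
    total = 0.0
    for i, v in enumerate(values):
        if debug and math.isnan(v):
            breakpoint()      # only reached for the bad element; inspect `i` and `v`
        total += v
    return total / len(values)

mean = running_mean([1.0, 2.0, 3.0])
print(mean)
```

On Python 3.7+, setting the environment variable `PYTHONBREAKPOINT=0` turns every `breakpoint()` call into a no-op, which helps if one is left in by accident.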
---
## unittest Assertions
```python
import unittest
import numpy as np

class TestFoo(unittest.TestCase):
    def test_examples(self): # assertions must be called inside a test method
        # Equality
        self.assertEqual(actual, expected) # actual == expected
        self.assertNotEqual(a, b)
        # Truthiness
        self.assertTrue(x)
        self.assertFalse(x)
        self.assertIsNone(x)
        self.assertIsNotNone(x)
        # Identity & Type
        self.assertIs(a, b) # a is b
        self.assertIsInstance(obj, cls)
        # Containers
        self.assertIn(item, container) # item in container
        self.assertCountEqual(a, b) # same elements, any order
        # Numeric
        self.assertAlmostEqual(a, b, places=7) # round(a-b, places) == 0
        self.assertGreater(a, b) # also: GreaterEqual, Less, LessEqual
        # Exceptions
        with self.assertRaises(ValueError):
            some_function()
        # NumPy arrays: use numpy.testing rather than assertEqual
        np.testing.assert_array_equal(actual, expected)
        np.testing.assert_array_almost_equal(actual, expected, decimal=6)
        np.testing.assert_allclose(actual, expected, rtol=1e-7, atol=0)
```
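
A small runnable test case that combines a few of these assertions. `normalize` is a hypothetical function under test, included only so the example runs on its own.

```python
import unittest
import numpy as np

def normalize(vec):
    """Hypothetical function under test: scale vec to unit length."""
    vec = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(vec)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return vec / norm

class TestNormalize(unittest.TestCase):
    def test_unit_length(self):
        result = normalize([3, 4])
        np.testing.assert_allclose(result, [0.6, 0.8])
        self.assertAlmostEqual(float(np.sum(result**2)), 1.0)

    def test_zero_vector_raises(self):
        with self.assertRaises(ValueError):
            normalize([0, 0])

# Run the suite programmatically; `python -m unittest` works too.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNormalize)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```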
### Reading Test Failures
```
AssertionError: Lists differ: [1, 2, 3] != [1, 2, 4]
First differing element 2:
3
4
```
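The message prints the two arguments in call order. With the `assertEqual(actual, expected)` convention used above, the first value (`[1, 2, 3]`) is what your code produced and the second is what the test expected. `unittest` does not enforce this order, so check how the test was written.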
---
## Quick Debugging Checklist
1. **Read the error message** — line number and exception type
2. **Check shapes** — `print(arr.shape)` liberally
3. **Check types** — `print(type(x))`, especially for tuple vs array
4. **Check values** — edge cases: empty arrays, zeros, negatives
5. **Simplify** — test function with minimal input
6. **Compare against spec** — re-read docstring/test expectations
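
Steps 2 to 4 can be folded into one helper. `describe` is a hypothetical convenience function sketched for this checklist, not part of NumPy.

```python
import numpy as np

def describe(name, x):
    """Return a one-line summary: type, then shape/dtype/range or length."""
    parts = [f"{name}: type={type(x).__name__}"]
    if isinstance(x, np.ndarray):
        parts.append(f"shape={x.shape} dtype={x.dtype}")
        if x.size and np.issubdtype(x.dtype, np.number):
            parts.append(f"min={x.min()} max={x.max()}")
    elif isinstance(x, (tuple, list)):
        parts.append(f"len={len(x)}")
    return " ".join(parts)

arr = np.array([0, 2, 0, 3])
print(describe("arr", arr))
print(describe("idx", np.nonzero(arr)))    # shows the tuple wrapper at a glance
```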