Midterm 1 Key Concepts - Python Cheat Sheet

# Python Cheat Sheet for Midterm 1

---

## General Syntax

| Operations | Result |
|----------|----------|
| x in s | True if an item of s is equal to x, else False |
| x not in s | False if an item of s is equal to x, else True |
| s + t | Concatenation of s and t |
| `s * n` or `n * s` | Equivalent to adding s to itself n times |
| s[i] | ith item of s, where the origin is 0 |
| s[i:j] | slice of s from i (inclusive) to j (exclusive) |
| s[i:j:k] | slice of s from i to j with step k |
| len(s) | Number of items in s (the last valid index is len(s) - 1) |
| min(s) | smallest item of s |
| max(s) | largest item of s |
| s.index(x[,i[,j]]) | Index of the first occurrence of x in s at or after i but before j |
| s.count(x) | total # of occurrences of x in s |

## Tuples

* Ordered - Elements are sequentially arranged according to insertion order
* Indexable - Access elements using an index
* Immutable - Cannot be changed
* Heterogeneous - Can store objects of different data types
* Nestable - Can contain other tuples
* Iterable - Traverse using a loop or comprehensions while you perform operations
* Sliceable - Extract some sequential elements
* Combinable - Support concatenations
* Hashable - Can work as keys in dictionaries (as long as every element is itself hashable)

### Useful Python Code for Tuples

## Lists

* Ordered – Elements are arranged in insertion order
* Indexable – Access elements with an index (L[0])
* Mutable – Can be changed (add, remove, modify elements)
* Heterogeneous – Can store different data types
* Nestable – Can contain other lists (or mixed structures)
* Iterable – Traverse using loops or comprehensions
* Sliceable – Extract subsequences (L[1:4])
* Combinable – Support concatenation (L1 + L2) and repetition (L * 3)
* Duplicate-friendly – Allows repeated elements

### Useful Python Code for Lists

* Append vs Extend - use `append` to add a single item to the end of a list; use `extend` to add every item of an iterable (list, string, set, etc.) to the end of a list:
  * Refer to PMT1-F22Q2

```Python
L = [1, 2]
L.append(3)       # [1, 2, 3]
L.extend([3, 4])  # [1, 2, 3, 3, 4]
```

## Dictionaries (and Defaults)

* Insertion-ordered – Keys maintain insertion order (an implementation detail in CPython 3.6; officially guaranteed from Python 3.7)
* Key–Value pairs – Store data as mappings {key: value}
* Mutable – Values can be added, updated, or deleted
* Key restrictions – Keys must be hashable (e.g., str, int, or a tuple of hashable items); values can be any type
* Unique keys – Duplicate keys not allowed (last assignment wins)
* Iterable – Iterate over keys, values, or items (d.keys(), d.values(), d.items())
* Efficient lookup – Fast access by key (d["name"])
* Nestable – Can contain lists, dicts, or sets as values
* Hashable keys – Only hashable objects can be keys
* To access the keys, values, and items within a dictionary:

```Python
keys = dictionary_name.keys()      # returns a view of the keys (wrap in list() for a list)
values = dictionary_name.values()  # returns a view of the values
items = dictionary_name.items()    # returns a view of (key, value) tuples
```

* To look up a value by key (raises KeyError if the key is missing):

```Python
value = dictionary_name["key"]
```

#### Default Dictionary Syntax

```Python
from collections import defaultdict

# Create with a "default factory" function
d = defaultdict(int)   # default 0 for missing keys
d = defaultdict(list)  # default [] for missing keys
d = defaultdict(set)   # default set() for missing keys
```

### Useful Python Code for Dictionaries

* To add an item to a dictionary:

```Python
dictionary_name['key'] = 'value'
```

## Sets

* Unordered – No defined element order
* Unindexed – Cannot access by index (s[0] invalid)
* Mutable – Can add or remove elements (add, discard, remove)
* Unique elements – Automatically removes duplicates
* Heterogeneous – Can store different data types (as long as they’re hashable)
* Nestable restriction – Cannot contain mutable objects like lists or dicts, but can contain tuples
* Iterable – Traverse elements using loops
* Set operations – Union (`|`), Intersection (`&`), Difference (`-`), Symmetric difference (`^`)
* Efficient membership test – Fast `in`
checks

### Useful Python Code for Sets

* `&` between two sets = intersection (the elements they share).
* `|` = union (all elements from both).
* `-` = difference (things in the first but not the second).
* `^` = symmetric difference (things in one or the other, but not both).

## JSON Files

## Nested Data Types

* PMT1-Fall22Q1 - Contains a Reddit question that displays a JSON file made up of many nested lists and dictionaries.

## Regex

#### Character Classes

* `\d` → digit (0–9)
* `\D` → non-digit (anything not 0–9)
* `\w` → “word” character = [A-Za-z0-9_]
* `\W` → non-word character
* `\s` → whitespace (space, tab, newline, etc.)
* `\S` → non-whitespace

#### Quantifiers

* `*` → 0 or more
* `+` → 1 or more
* `?` → 0 or 1 (optional)
* `{n}` → exactly n
* `{n,}` → at least n
* `{n,m}` → between n and m

#### Anchors / position

* `^` → start of string (or line, with re.M)
* `$` → end of string (or line, with re.M)
* `\b` → word boundary
* `\B` → not a word boundary

#### Wildcards & groups

* `.` → any single character except newline (unless re.S)
* `( … )` → capturing group
* `(?: … )` → non-capturing group
* `(?P<name> … )` → named group
* `|` → OR

#### Functions

| Function | What it does | Example |
|----------|----------|----------|
| re.search(pattern, text) | Finds the first match anywhere in the string. Returns a Match object or None. | ```re.search(r"\d+", "abc123xyz").group() → "123" ``` |
| re.match(pattern, text) | Like search, but only checks at the start of the string. | ```re.match(r"\w+", "hello world").group() → "hello"``` |
| re.findall(pattern, text) | Returns a list of all matches. | ```re.findall(r"\d+", "a1 b22 c333") → ['1', '22', '333']``` |
| re.finditer(pattern, text) | Returns an iterator of Match objects (useful for groups/named groups). | ```[m.group() for m in re.finditer(r"\d+", "a1 b22")] → ['1','22']``` |
| re.sub(pattern, repl, text) | Replace all matches with repl. | ```re.sub(r"\d+", "#", "a1 b22") → "a# b#"``` |
| re.split(pattern, text) | Split string by the pattern (like str.split but regex-aware). | ```re.split(r"\s+", "a b c") → ['a','b','c']``` |
| re.compile(pattern, flags=0) | Precompile regex for reuse; the compiled pattern has its own methods like .search() and .findall(). | ```pat = re.compile(r"\d+"); pat.findall("a1 b22") → ['1','22']``` |

#### Useful Match Object Methods

* .group() → whole match or group by index/name.
* .groups() → tuple of all capture groups.
* .groupdict() → dict of named groups.
* .start(), .end() → positions of match in string.

#### Flags

| Flag | Name | Modification |
|------|------|------|
| ```re.I``` | ignore case | Makes the expression search case-insensitive. |
| ```re.G``` | global | Not available in Python (this is JavaScript's ```g``` flag). Use ```re.findall``` or ```re.finditer``` to get all occurrences. |
| ```re.S``` | dot all | Makes the wildcard character ```.``` match newlines as well. |
| ```re.M``` | multiline | Makes the boundary characters ```^``` and ```$``` match the beginning and ending of every single line instead of the beginning and ending of the whole string. |
| ```re.Y``` | sticky | Not available in Python (this is JavaScript's ```y``` flag, which relies on ```lastIndex```). A compiled pattern's ```.match(text, pos)``` starts matching at a given index instead. |
| ```re.U``` | unicode | Already the default for ```str``` patterns in Python 3, so it is redundant there. ```re.A``` does the opposite and restricts ```\w```, ```\d```, ```\s```, and ```\b``` to ASCII. |
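A short sketch of how the regex pieces above fit together: named groups, the `re.M` and `re.I` flags, and Match object methods. The `log` text, the `user` and `count` group names, and the pattern itself are made up for illustration and do not come from any past exam.

```python
import re

# Hypothetical sample text: one "name: N errors" entry per line.
log = "alice: 3 errors\nBOB: 12 ERRORS\ncarol: 0 errors"

# re.M lets ^ and $ match at every line; re.I ignores case ("ERRORS").
pattern = re.compile(r"^(?P<user>\w+):\s+(?P<count>\d+)\s+errors$", re.M | re.I)

# finditer yields one Match object per line, so named groups are available.
counts = {m.group("user").lower(): int(m.group("count"))
          for m in pattern.finditer(log)}

# search returns only the first match (or None if nothing matches).
first = pattern.search(log)
first_groups = first.groupdict()          # {'user': 'alice', 'count': '3'}
first_span = (first.start(), first.end())  # (0, 15)

# sub replaces every run of digits with "#".
masked = re.sub(r"\d+", "#", log)

print(counts)
print(masked)
```

Note that groups always come back as strings, which is why the count is passed through `int()` before being stored.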
## Strings

* ```text = text.lower()``` for lower case
* ```text = text.upper()``` for upper case
* ```text = " ".join(iterable)``` to concatenate the strings in an iterable into a single string, separated by spaces
* ```text.split()``` splits a string on runs of whitespace (" ", \t, \n)
* ```text = " ".join(text.split())``` to collapse any run of whitespace down to a single space (leading and trailing whitespace is dropped too)
* ```text.strip()``` removes leading and trailing whitespace by default; pass a string to strip any of its characters instead
* ```text.replace(old, new)``` replaces all occurrences of old with new
* ```f"{var}"``` f-string syntax

| Function | Returns True if… | Example |
|------------------|----------------------------------------------------------|---------------------------------|
| `str.isalpha()` | All characters are alphabetic (A–Z, a–z) | `"abc".isalpha()` → True |
| `str.isdigit()` | All characters are digits (0–9) | `"123".isdigit()` → True |
| `str.isnumeric()` | All characters are numeric (digits + fractions, etc.) | `"Ⅻ".isnumeric()` → True |
| `str.isdecimal()` | All characters are decimal digits only | `"123".isdecimal()` → True |
| `str.isalnum()` | All characters are alphanumeric (letters + digits) | `"abc123".isalnum()` → True |
| `str.isspace()` | All characters are whitespace (spaces, tabs, newlines) | `" \n".isspace()` → True |
| `str.islower()` | All cased characters are lowercase (and at least one) | `"hello".islower()` → True |
| `str.isupper()` | All cased characters are uppercase (and at least one) | `"HELLO".isupper()` → True |
| `str.istitle()` | String is titlecased (e.g., “Hello World”) | `"Hello World".istitle()` → True |
| `str.isidentifier()` | String is a valid Python identifier | `"variable1".isidentifier()` → True |
| `str.isprintable()` | All characters are printable or spaces | `"abc!".isprintable()` → True |
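The string methods above are usually chained together. Here is a minimal sketch; the `normalize` helper and the sample text are hypothetical names chosen for illustration.

```python
def normalize(text):
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())

# Hypothetical messy input with tabs, newlines, and padding.
raw = "  Hello,\tWORLD \n 2024  "

clean = normalize(raw)                       # "hello, world 2024"
replaced = clean.replace("world", "there")   # "hello, there 2024"
stripped = "xxhixx".strip("x")               # "hi"

# The is*() checks work well as filters over split tokens.
tokens = clean.split()                               # ["hello,", "world", "2024"]
digit_tokens = [t for t in tokens if t.isdigit()]    # ["2024"]
alpha_tokens = [t for t in tokens if t.isalpha()]    # ["world"]

print(f"{clean!r} -> digits={digit_tokens}, words={alpha_tokens}")
```

`"hello,"` is excluded from `alpha_tokens` because the trailing comma is not alphabetic; strip punctuation first if that matters.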