[Toc]

## data type
1. Numbers (integer / float / complex / boolean)
2. String
3. List
4. Tuple
5. Set
6. Dictionary

> lists, sets and dictionaries are mutable
> numbers, strings, and tuples are ++im++mutable

### string
#### string operator
| symbol | example, assume a = 'Python'       |
| ------ | ---------------------------------- |
| +      | ```'Hi' + a``` = ```HiPython```    |
| *      | ```a * 2``` = ```PythonPython```   |
| []     | ```a[2]``` = ```t```               |
| [:]    | ```a[1:4]``` = ```yth```           |
| [::]   | ```a[4:0:-1]``` = ```ohty```       |
| in     | ```'P' in a``` = ```True```        |
| not in | ```'m' not in a``` = ```True```    |
| r/R    | ```r"\n\t"``` = ```\n\t```         |
| %      | ```"<%d>"%(2)``` = ```<2>```       |

==EXAMPLE==
```bash
$ cat string.py
x = 'Hello World!'
print(x[0:])
print(x[:-1])
print(x[8:2:-1])
print(x[::-1])
print(x[::2])
print(x[::])
```
RESULT:
```bash
$ python3 string.py
Hello World!
Hello World
roW ol
!dlroW olleH
HloWrd
Hello World!
```

#### comparison
first: list / tuple / string / set
* list: __a mutable ordered sequence__ of things. ```x = [1, 2.0, 'three', [4], 5j]```
* tuple: an immutable ordered sequence of things. (essentially, __a read-only list__) ```x = (1, 2.0, 'three', [4], 5j)```
* string: an immutable ordered sequence of characters. (essentially, __a tuple of characters__) ```x = "1, 2.0, 'three', [4], 5j"```
* set: a mutable __unordered__ collection of unique __immutable things__. (essentially, __an unordered list__ without duplicates) ```x = {1, 2.0, 'three', 4, 5j}```
_*note: each element of a set must be immutable (hashable), hence we cannot put a list such as ```[4]``` inside a set._
* dictionary: a mutable __hash table__ with _immutable_ keys.
(essentially, a set with attached values) ```x = {1:2.0, 'three':[4], 5j:6}```

in interactive mode:
```python
>>> set('5.0')
{'0', '5', '.'}
>>> str(5.0)
'5.0'
>>> int(5.0)
5
>>> list('5.0')
['5', '.', '0']
>>> sorted('5.0') # by code point (ASCII) order
['.', '0', '5']
>>> str(_)
"['.', '0', '5']"
>>> tuple(_)
('[', "'", '.', "'", ',', ' ', "'", '0', "'", ',', ' ', "'", '5', "'", ']')
>>> list(tuple(str(complex(5))))
['(', '5', '+', '0', 'j', ')']
```
```python
>>> list("hello")
['h', 'e', 'l', 'l', 'o']
>>> list("hello", "hello")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list expected at most 1 argument, got 2
>>> [*"hello", *"hello"]
['h', 'e', 'l', 'l', 'o', 'h', 'e', 'l', 'l', 'o']
>>> L1 = [1, 2, 3]; L2 = [4, 5, 6]; s = "abc"
>>> [L1, L2, s]; [*L1, *L2, *s]
[[1, 2, 3], [4, 5, 6], 'abc']
[1, 2, 3, 4, 5, 6, 'a', 'b', 'c']
```
_*note: ```*``` __unpacks__ an iterable into its individual elements._

#### data type conversion function
```python
>>> #these return various kinds of numbers
>>> complex(1, 12)
(1+12j)
>>> ord('A')
65
>>> ord('AB')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ord() expected a character, but string of length 2 found
>>> round(2.7182818, 2)
2.72
```
```python
>>> #these all do what you expect them to do
>>> tuple({2,( ),'@♞'})
('@♞', 2, ())
>>> list({2,( ),'@♞'})
['@♞', 2, ()]
>>> sorted({2,( ),'@♞'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'str'
```
```python
>>> #these all return strings
>>> hex(9822)
'0x265e'
>>> oct(9822)
'0o23136'
>>> bin(9822)
'0b10011001011110'
>>> chr(9822)
'♞'
>>> ascii ([1,2,[1],'@♞']) #♞ is 265E in unicode
"[1, 2, [1], '@\\u265e']"
>>> format('@♞',"#>10s") # left pads a string
'########@♞'
>>> format('@♞',"^10s") # default pad is space
'    @♞    '
```

==EXAMPLE==
```python
>>> x = ','; a = 'abc'
>>> print(list(a), end='\t\t'); print(a[0], a[1], a[2], sep = x)
['a', 'b', 'c']		a,b,c
```
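
the conversions and the ```*``` unpacking shown above can also be combined in a small script. this is only a minimal sketch: the variable names (```text```, ```chars```, ```merged```, ...) are made up for illustration and are not part of the notes above.

```python
# a small sketch tying together the conversions above (names are illustrative only)
text = "5.0"

as_float = float(text)              # str -> float
as_int = int(as_float)              # float -> int (truncates toward zero)
chars = list(text)                  # str -> list of one-character strings
unique_sorted = sorted(set(text))   # set drops duplicates; sorted() returns a list

# * unpacks any iterable, inside a list display or in a call's arguments
merged = [*chars, *"ab"]

print(as_float, as_int)
print(chars, unique_sorted, merged, sep=" | ")
print(*chars, sep=",")              # same idea as print(a[0], a[1], a[2], sep=x)
```

note that ```sorted(set(text))``` is used instead of ```set(text)``` alone because the iteration order of a set is not something to rely on.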
#### useful non-type-conversion functions
* ```print()``` converts its arguments to __strings__ before writing them (it returns ```None```)
* ```sorted(..., reverse=True)``` sorts in descending code point (ASCII) order
* ```reversed()``` iterates in the reverse of the original order
```python
>>> L = [*"hello world"]; L
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
>>> sorted(L, reverse=True)
['w', 'r', 'o', 'o', 'l', 'l', 'l', 'h', 'e', 'd', ' ']
>>> [*reversed(L)]
['d', 'l', 'r', 'o', 'w', ' ', 'o', 'l', 'l', 'e', 'h']
>>> list(L[::-1])
['d', 'l', 'r', 'o', 'w', ' ', 'o', 'l', 'l', 'e', 'h']
```
* difference between ```sorted()``` and ```.sort()```
```python
>>> L = [*"hello world"] # a list
>>> sorted(L) # would not change the object, L
[' ', 'd', 'e', 'h', 'l', 'l', 'l', 'o', 'o', 'r', 'w']
>>> L
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
>>>
>>> L.sort() # it would change the object
>>> L
[' ', 'd', 'e', 'h', 'l', 'l', 'l', 'o', 'o', 'r', 'w']
>>>
>>> s = "hello world" # a string
>>> s.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'sort'
>>> sorted(s) # sorted creates a list
[' ', 'd', 'e', 'h', 'l', 'l', 'l', 'o', 'o', 'r', 'w']
```

#### four ways to format strings
1. the format built-in function:
```python
>>> format('@♞',"^10s") # default pad is space
'    @♞    '
```
2. C-style formatting:
```python
>>> "%dhr✕$%.2f/hr=$%.2f"%(8,10.5,8*10.5)
'8hr✕$10.50/hr=$84.00'
```
3. the format method of string:
```python
>>> S="{hour:d}hr✕${rate:.2f}/hr=${pay:.2f}"
>>> S.format(pay=8*10.5,hour=8,rate=10.5)
'8hr✕$10.50/hr=$84.00'
>>> "{x:d}{y:s}{x:.1f}{y:s}{x:%}".format(y=",",x=1)
'1,1.0,100.000000%'
>>> S="{:d}hr✕${:.2f}/hr=${:.2f}"
>>> S.format(8,10.5,8*10.5)
'8hr✕$10.50/hr=$84.00'
```
4.
formatted string literals:
```python
>>> pay=8*10.5; hour=8.0; rate=10.5
>>> f"{int(hour)}hr✕${rate:.2f}/hr=${pay:.2f}"
'8hr✕$10.50/hr=$84.00'
```

### list
similar to an array in C, but the items can be of different data types

a list is _mutable_, so we can assign to individual elements or slices inside it (which is not the same process as assigning to the entire variable)
```python!
>>> L=[1,2,3,4,5,6,7,8,9]
>>> L[2:6]
[3, 4, 5, 6]
>>> L[2:6]=['a', 'b', 'c', 'd']
>>> L
[1, 2, 'a', 'b', 'c', 'd', 7, 8, 9]
>>> L[2:6]=['x', 'y'] # with different length it's okay
>>> L
[1, 2, 'x', 'y', 7, 8, 9]
>>> L[2:6]=[]
>>> L
[1, 2, 9]
>>> L[1:7:2]=[1, 2, 3, 4] # if using a step, sizes must match
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: attempt to assign sequence of size 4 to extended slice of size 1
```
* ```del``` deletes elements of the list; objects that are no longer referenced are reclaimed by __garbage collection__

_*note: the old object is released automatically when a variable is reassigned, so using ```del``` on a whole variable isn't so useful. however, using it on individual elements is quite common._
```bash!
$ cat listDelete2.py
L = [ 'abcd', 786, 2.23, 'john', 70.2, 1, 2, 3, 4, 5, 6]
print (L[2]); del L[2]; print (L[2]) # L[3] is now in the L[2] spot
print (L)                            # See that L[3:] have now all shifted down
del L[2:7:2]; print (L)              # This deletes 3rd,5th,and 7th
del L                                # This deletes the entire list
print (L)                            # What prints if there is no list?
$ python3 listDelete2.py
2.23
john
['abcd', 786, 'john', 70.2, 1, 2, 3, 4, 5, 6]
['abcd', 786, 70.2, 2, 4, 5, 6]
Traceback (most recent call last):
  File "listDelete2.py", line 6, in <module>
NameError: name 'L' is not defined
```
In C, we manage our own garbage with calls like ```free()```
```c
#include <stdio.h>
#include <stdlib.h>
typedef struct { int x,y,z; } coord;
coord *A, *B, *C;
void print(coord *P){
    printf("%d,%d,%d\n", P->x,P->y,P->z);
}
int main() {
    A=(coord*)malloc(sizeof(coord));
    A->x=1; B=A; C=B;
    A->y=2; C->z=3;
    A=NULL; B=NULL;
    print(C);
    C=NULL;
    /* in this way, we only point the pointers to NULL,
     * we do NOT free the memory they pointed to,
     * therefore, the values stored in x, y, z are leaked:
     * they stay in memory until the program exits.
     */
    print(C); // dereferences NULL: segmentation fault
}
```
```bash
$ gcc -o test test.c
$ ./test
1,2,3
zsh: segmentation fault
```
otherwise:
```c
#include <stdio.h>
#include <stdlib.h>
typedef struct { int x,y,z; } coord;
coord *A, *B, *C;
void print(coord *P){
    printf("%d,%d,%d\n", P->x,P->y,P->z);
}
int main() {
    A=(coord*)malloc(sizeof(coord));
    A->x=1; B=A; C=B;
    A->y=2; C->z=3;
    A=NULL;
    print(C);
    free(B);
    /* in this way, B and C are dangling pointers
     * (they point to the same, already freed, address)
     */
    print(C); // use after free: undefined behavior
}
```
```bash
$ gcc -o test test.c
$ ./test
1,2,3
the output is unpredictable
```

### tuple
can be thought of as a __read-only__ list, i.e. it is immutable. strings can be thought of as tuples of characters.
* creating a singleton tuple (a tuple with one value) needs a trailing comma, since numerical expressions use parentheses too:
```python
>>> x = (50); print(x * 2)
100
>>> # (50) * 2 == 100 tells us (50) is a number
>>> x = (50,); print(x * 2) # use a comma at the end
(50, 50)
```
* any set of _comma-separated_ objects, written without identifying symbols like enclosing [] or (), defaults to a tuple
![image](https://hackmd.io/_uploads/BJ7Cn5lyR.png)
* You can’t remove individual tuple elements.
But you can put together a new tuple, while choosing to leave out the undesired elements:
```python
>>> tup = ('phys', 'chem', 2017, 2019)
>>> tup = tup[:1]+tup[-1:] # can even reuse name
>>> tup
('phys', 2019)
>>>
```
![image](https://hackmd.io/_uploads/Bko4yilJA.png)

### set
a set is defined like in math

python sets are __unordered__: one implication is that you ++cannot++ use order-based syntax (indexing, slicing), another is that comparisons ++ignore++ order. in addition, elements ++don't repeat++.
* python set elements must be ++immutable++, even though the set itself is ++mutable++

> L=[1,2]; S={[ ],L}; L.clear(); len(S) # Is len = 1 or 2?
> L=[1,2]; S={(L,)} # This won’t work
>

| symbol     | method                    |
| ---------- | ------------------------- |
| len(s)     |                           |
| x in s     |                           |
| x not in s |                           |
| s == t     |                           |
| s != t     |                           |
| s <= t     | s.issubset(t)             |
| s < t      |                           |
| s >= t     | s.issuperset(t)           |
| s > t      |                           |
| s \| t     | s.union(t)                |
| s & t      | s.intersection(t)         |
| s - t      | s.difference(t)           |
| s ^ t      | s.symmetric_difference(t) |
|            | s.copy()                  |

### dictionary
dictionaries are ++mutable++ and have colons, :, inside of their curly braces, {}. colons separate the keys from the values; dictionaries store key/value pairs and are implemented as ++__dynamic hash tables__++. keys must be __immutable__ (string, number or tuple), while values have no restriction.
e.g., ```student={'name':'Bob', 'age':20, 'major':'cse'}```
* ```dict.clear()``` removes all items from the dictionary
* ```del(dict)``` deletes the dictionary variable itself

==EXAMPLE==
```python!
>>> student={'CSE':[1,(((2,),),)]}
>>> weird={(5,):student, 2+3j:"str"} # this student is a value
>>> print ("Weird:", weird)
Weird: {(5,): {'CSE': [1, (((2,),),)]}, (2+3j): 'str'}
>>> works = "works"
>>> weird[works] = 1
>>> weird[student] = 2 # key MUST BE immutable, while student is a dictionary, which is mutable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
```
* which type is {}?
```python
>>> print(type({}))
<class 'dict'>
```

| empty objects | singleton objects |
| ------------- | ----------------- |
| ```L=[]```    | ```L=[1]```       |
| ```T=()```    | ```T=(1,)```      |
| ```S=""```    | ```S="1"```       |
| ```D={}```    | ```D={1:1}```     |
| ```S=set()``` | ```S={1}```       |

* built-in functions and methods
e.g., ```len(L)``` returns the length of the list
```max(L)```, ```max(T)```, ```max(S)```
```min(L)```, ```min(T)```, ```min(S)```

### the methods of each data type
```python
>>> dir(tuple)[-2:]
['count', 'index']
>>> dir(list)[-11:]
['append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>> dir(set)[-17:]
['add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']
>>> dir(dict)[-11:]
['clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
>>> dir(str)[-45:]
['center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
```
However, not all methods are equally important. These are the common methods we use frequently:
```python
>>> ImportantMethodsOf(tuple)
['count', 'index']
>>> ImportantMethodsOf(list)
['append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>> ImportantMethodsOf(set)
['clear', 'copy', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove']
>>> ImportantMethodsOf(dict)
['clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'values']
>>> ImportantMethodsOf(str)
['count', 'encode', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isnumeric', 'isupper', 'join', 'lower', 'replace', 'split', 'startswith', 'upper']
>>> ImportantMethodsOf(int)
['bit_length', 'from_bytes', 'to_bytes']
>>> ImportantMethodsOf(bool)
[]
>>> ImportantMethodsOf(float)
['is_integer']
>>> ImportantMethodsOf(complex)
['conjugate', 'imag', 'real']
```
* ```list.append(obj)``` appends the object itself to the list
* ```list.extend(seq)``` appends the contents of the sequence to the list
* ```list.pop([position])``` removes and returns the last element (or the one at position)
* ```list.remove(obj)``` removes the __first occurrence__ of obj from the list
* ```str.format(argument)``` returns the result of inserting the arguments into the format str
* ```map()``` applies a given function to each element
```python
>>> from math import factorial, gamma
>>> factorial(range(0, 10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'range' object cannot be interpreted as an integer
>>> list(map(factorial,range(0,10)))
[1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
>>> gamma(range(1,11))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be real number, not range
>>> list(map(gamma,range(1,11)))
[1.0, 1.0, 2.0, 6.0, 24.0, 120.0, 720.0, 5040.0, 40320.0, 362880.0]
>>> list(map(int,map(gamma,range(1,11))))
[1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
>>>
```

==EXAMPLE==
```python
>>> x = [1, 2, 3, 4]
>>> x.extend('abc')
>>> x
[1, 2, 3, 4, 'a', 'b', 'c']
>>> x.append('xyz')
>>> x
[1, 2, 3, 4, 'a', 'b',
'c', 'xyz']
```

## import standard function
* ```from modname import f1, f2, s1``` we can access these objects directly, i.e. "s1" not "modname.s1"
* ```from modname import *```
* ```import numpy as np```
* ```import module1``` user-defined module
* ```import dir1.dir2...dirn.package```
* ```if __name__ == '__main__'```

An \_\_init\_\_.py is a file that runs when its package is loaded.

__\_\_pycache\_\___
a .pyc file is the byte-compiled version of a module. Byte-code is platform independent (but specific to the Python version); .pyc files are cached so that a module loads faster the next time, and they can be distributed without the .py source code.
```bash
$ cat example.py
print('hello world!')
$ python3
Python 3.12.2 (main, Feb 6 2024, 20:19:44) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import example; exit()
hello world!
$ ls
__pycache__ example.py
$ cd __pycache__; ls
example.cpython-312.pyc
$ python3 example.cpython-312.pyc
hello world!
$ cat example.cpython-312.pyc
? ˏ?e???ed?y)z ???m?r%
```
_*note: a .pyc file is a python program that is not a "script", which means that it is <font color=#f00>not meant to be viewed</font> by people._

## def function
the function's body __must be__ indented.
```python
def function(parameter):
    """description of the function"""
    statement1
    statement2
    return value # optional
```
If we define a function in interactive mode, a final empty line is needed after the function body. It may look like below:
```python
>>> def f():
...     print('hello world')
...
>>>
```
Python takes the idea of objects to its limit: a function is also an object
```python!
>>> def f(x):
...     print("the number:",x)
...
>>> L = [3.0, f, 7]
>>> L
[3.0, <function f at 0x1009d2980>, 7]
>>> f(5)
the number: 5
>>> L[1]
<function f at 0x1009d2980>
>>> L[1](5)
the number: 5
>>> # there is no difference between a literally-typed invocation
>>> # and one invoked through a variable
```
_\_\_builtins\_\__
* ```callable()``` vs ```type()```
* comparing ```hash()``` & ```id()```
```python
>>> def f(): print("f() was called")
...
>>> f()
f() was called
>>> f
<function f at 0x1009d2840>
>>> hex(id(f))
'0x1009d2840'
>>> print(f, hex(id(f)))
<function f at 0x1009d2840> 0x1009d2840
>>> print(hash.__doc__)
Return the hash value for the given object.

Two objects that compare equal must also have the same hash value, but the
reverse is not necessarily true.
>>> hash(0)
0
>>> hash("") # unlike id(), hashes aren't unique
0
>>> (hash(0) == hash("")) and (0 != "")
True
>>> hash(())
5740354900026072187
>>> x=123456
>>> hash((x,))
961584959793188436
>>> hash([x])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> hash({x})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
```
_*note: only __immutable__ built-in types are hashable, and dictionaries and sets are built on hash tables._

* the ```is``` & ```is not``` operators: ```is``` is similar to ```==```, but ```is``` tests ++whether the two objects are the same object++, rather than testing whether they have the same value
![image](https://hackmd.io/_uploads/H1p50r-JC.png)

## control flow
### conditional
```python
if expression1:
    statement1
elif expression2:
    statement2
else:
    statement3
```
### loop
```python
while expression:
    statement
```
python supports an ```else``` after a loop; below are 2 inequivalent while loops
```python
i = 1
while i < 5:
    if(A[i] == "the string i want"): break
    i+=1
print("always print this line when the loop is done")

i = 1
while i < 5:
    if(A[i] == "the string i want"): break
    i+=1
else: print("no match was found")
```
* ```pass```: same as the isolated ```;``` in C

python
```python
while True: pass
print("never print")
```
C
```c
#include<stdio.h>
int main(){
    while(1);
    printf("never print\n");
}
```
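
the loop ```else``` described above can be sketched as a small search helper. this is only a minimal illustration: ```find_index``` and the list ```A``` are made-up names for this example, not part of the notes above.

```python
def find_index(items, target):
    """Return the index of target in items, or -1 if it is absent."""
    i = 0
    while i < len(items):
        if items[i] == target:
            break        # leaving via break skips the else clause
        i += 1
    else:
        # runs only when the loop condition became false without a break
        return -1
    return i


A = ["spam", "the string i want", "eggs"]
print(find_index(A, "the string i want"))   # a match: the else clause is skipped
print(find_index(A, "ham"))                 # no match: the else clause runs
```

the ```else``` branch belongs to the ```while```, not to the ```if```: it is the "nothing was found" path, which in C would usually need an extra flag variable.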