Week - 2 FTMLE

**Monday**

List comprehension:
> * Provides a concise way to create a list
> * Always returns a new list

Instead of writing a loop:

    newlist = []
    for i in oldlist:
        if filter(i):
            newlist.append(expression(i))

you can get the same behavior with a list comprehension:

    newlist = [expression(i) for i in oldlist if filter(i)]

> * The same syntax can create a dictionary or a set:

    square_dict = {x: x * x for x in range(5)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
    square_set = {x * x for x in [1, -1]}       # {1}

> * It doesn't have to use the loop variable:

    zeros = [0 for _ in even_numbers]  # has the same length as even_numbers

> * It can include multiple `for` clauses:

    pairs = [(x, y) for x in range(10) for y in range(10)]  # 100 pairs (0,0) (0,1) ... (9,8), (9,9)

Iterables and Generators:

A generator is a function that returns an object (an iterator) which we iterate over one value at a time.

    def generate_range(n):
        i = 0
        while i < n:
            yield i  # each yield produces the next value of the generator
            i += 1

The difference between a list comprehension and a generator expression is that a list comprehension produces the entire list at once, while a **generator only produces one item at a time**. Because of that, generators are **more efficient in terms of memory**. A generator expression (sometimes called a "generator comprehension") uses the same syntax as a list comprehension, but with parentheses instead of square brackets.

    def natural_numbers():
        """returns 1, 2, 3, ..."""
        n = 1
        while True:
            yield n
            n += 1

    data = natural_numbers()
    evens = (x for x in data if x % 2 == 0)
    even_squares = (x ** 2 for x in evens)

**It doesn't do any work until you iterate over it.**

Automated Testing via assert:

`assert` is used to test your code:

    assert 1 + 1 == 2
    assert 1 + 1 == 3

The second line raises an `AssertionError`.

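
The lazy generator pipeline and the `assert` checks described above can be combined in one small sketch. It assumes `itertools.islice` (not mentioned in these notes) to take a finite number of items from the infinite generator, and `first_even_squares` is a hypothetical helper name used only for illustration.

```python
from itertools import islice


def natural_numbers():
    """Yield 1, 2, 3, ... forever."""
    n = 1
    while True:
        yield n
        n += 1


def first_even_squares(count):
    """Return the first `count` squares of even natural numbers (hypothetical helper)."""
    evens = (x for x in natural_numbers() if x % 2 == 0)  # lazy: nothing is computed yet
    even_squares = (x ** 2 for x in evens)                # still lazy
    # islice pulls only `count` items, so the infinite generator is safe to consume
    return list(islice(even_squares, count))


# A passing assert is silent
assert first_even_squares(3) == [4, 16, 36]

# A failing assert raises AssertionError, which can be caught like any other exception
try:
    assert 1 + 1 == 3, "1 + 1 should not equal 3"
except AssertionError as error:
    failure_message = str(error)
```

Without `islice`, calling `list(even_squares)` would never finish, because the underlying generator is infinite.
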
Randomness:

    import random
    random.seed(10)  # this ensures we get the same results every time
    four_uniform_randoms = [random.random() for _ in range(4)]
    four_uniform_randoms

To get the same result again, reset the seed with *seed*:

    random.seed(10)         # set the seed to 10
    print(random.random())
    random.seed(10)         # reset the seed to 10
    print(random.random())  # same result again

Zip and Argument Unpacking:

`zip` pairs up the elements of several iterables into tuples. Argument unpacking, as in `zip(*pairs)`, reverses this and "unzips" the tuples back into separate sequences. It is unrelated to compressing files into, or extracting files from, .zip archives.

Decorator:
> * Adds functionality to existing code without modifying it.
> * Works because functions can be treated like other data types (string, int, float, list, and so on): they can be passed as arguments and returned from other functions.
> * @my_decorator is just an easier way of saying: say_whee = my_decorator(say_whee)

RegEx:

A regular expression is a sequence of characters that defines a search pattern.

RegEx cheat sheet:
![](https://i.imgur.com/huW7GxW.png)

---

**Tuesday**

Object-Oriented Programming:
- **Inheritance** Creating a new class that reuses the details of an existing class without modifying the existing class.
- **Encapsulation** Hiding the private details of a class from other objects.
- **Polymorphism** Using a common operation in different ways for different data inputs.

A class is a blueprint for an object. An object (instance) is an instantiation of a class. When a class is defined, only the description of the object is defined; therefore, no memory or storage is allocated for an object until one is created.

Ex: in `obj = Parrot()`, `obj` is an object of class `Parrot`.

A method is a function defined inside the body of a class.

**Key Points to Remember**:
- Programming becomes easier and more efficient.
- Classes are sharable, so code can be reused.
- The productivity of programmers increases.
- Data is safe and secure with data abstraction.

Recursion:
- Is the process of defining something in terms of itself.
- It needs a base case to stop the recursion.
- Can make code cleaner and more elegant.
- A complex task can be broken down into simpler sub-problems.
- Can take a lot of memory and time.
- Harder to debug.
- The logic behind it can be hard to follow.

Time and Space Complexity (Big O notation):
- Quantifies the amount of time and space an algorithm takes to run as a function of the length of the input.

![](https://i.imgur.com/cTEgOUh.png)

---

**Wednesday**

NumPy is a library that provides a high-performance multidimensional array.

This is a NumPy cheat sheet: [https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf)

---

**Thursday**

Relational Database:
- It uses a structure that allows us to identify and access data in relation to another piece of data.

Table, Record, Field, Row & Column:
- A table is a set of data elements (values) organized in rows and columns.
- A record (row) is a single entry in a table.
- Each item in a record is called a field (column).
- A primary key is one or more fields that uniquely identify a row in a table. It cannot be null, and it is indexed.
- A foreign key is one or more columns in one table that refer to the primary key of another table, creating a relationship between the two tables (the referenced column is indexed).

SQL:
- SQL is a language for manipulating data in a database. It can be broken down into three distinct groups:
    - DDL: create, drop, alter
    - DML: select, insert, update, delete
    - DCL: manages user access

Logical operators: AND, OR, NOT

LIKE: matches a value against a pattern.
- `%` matches any number of characters
- `_` matches exactly one character

Conditional operators: >, <, >=, <>, <=, =

Aggregate functions: COUNT(*), SUM(column), AVG(column), MAX(column), MIN(column)

HAVING: you can't use WHERE with aggregate functions; use HAVING instead.

EX:

    SELECT district, AVG(unit_price)
    FROM product
    GROUP BY district
    HAVING AVG(unit_price) >= 200; /* Filters the result after it has been grouped. */

Joining tables:
- INNER JOIN returns records that have matching values in both tables.
- LEFT JOIN returns all records from the left table and the matched records from the right table.
- RIGHT JOIN is the opposite of LEFT JOIN: all records from the right table and the matched records from the left table.
- FULL JOIN returns all records from both tables, whether or not there is a match in the other table.

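
The HAVING and JOIN behavior above can be tried out with Python's built-in `sqlite3` module. This is only a minimal sketch: the rows in `product`, the `district_info` table, and the manager names are made-up sample data, not something from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Hypothetical sample tables, for illustration only
cur.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, district TEXT, unit_price REAL)")
cur.execute("CREATE TABLE district_info (district TEXT PRIMARY KEY, manager TEXT)")
cur.executemany(
    "INSERT INTO product (district, unit_price) VALUES (?, ?)",
    [("D1", 150), ("D1", 300), ("D2", 100), ("D3", 250)],
)
cur.executemany(
    "INSERT INTO district_info VALUES (?, ?)",
    [("D1", "An"), ("D2", "Binh")],
)

# HAVING filters whole groups after aggregation
having_rows = cur.execute(
    """
    SELECT district, AVG(unit_price)
    FROM product
    GROUP BY district
    HAVING AVG(unit_price) >= 200
    ORDER BY district
    """
).fetchall()

# INNER JOIN keeps only districts that appear in both tables
inner_rows = cur.execute(
    """
    SELECT DISTINCT p.district, d.manager
    FROM product AS p
    INNER JOIN district_info AS d ON p.district = d.district
    ORDER BY p.district
    """
).fetchall()

# LEFT JOIN keeps every district from product; a missing match becomes NULL (None)
left_rows = cur.execute(
    """
    SELECT DISTINCT p.district, d.manager
    FROM product AS p
    LEFT JOIN district_info AS d ON p.district = d.district
    ORDER BY p.district
    """
).fetchall()

conn.close()
```

With this sample data, `D3` has no row in `district_info`, so it is dropped by the INNER JOIN but kept by the LEFT JOIN with `None` as its manager.
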
Crawling data from TIKI:
- Create a database to store the data crawled by the functions.
- Once the database exists, define a table inside it using functions based on sqlite and Python, so that the data is stored in the database through SQL commands.
- The Category class has 2 main jobs: it holds the data for a category, and it provides methods to view these categories. Data is then inserted into the database by calling save_into_db.
- Get the URL and parse the HTML using the requests library and BeautifulSoup.
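
The storage side of the steps above might look roughly like the following sketch. Only `save_into_db` and the idea of a Category class come from the notes; the table name `categories`, its columns, the `create_categories_table` helper, and the placeholder URL are all assumptions. Fetching and parsing with requests and BeautifulSoup is left out so that the example runs offline.

```python
import sqlite3

# Assumption: an in-memory database; a real crawler would likely use a file path
conn = sqlite3.connect(":memory:")


def create_categories_table(connection):
    """Create the (hypothetical) categories table if it does not exist yet."""
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS categories (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT,
            url TEXT,
            parent_id INTEGER
        )
        """
    )
    connection.commit()


class Category:
    """Holds one crawled category and knows how to store itself."""

    def __init__(self, name, url, parent_id=None, cat_id=None):
        self.cat_id = cat_id
        self.name = name
        self.url = url
        self.parent_id = parent_id

    def __repr__(self):
        # Readable form for inspecting categories while crawling
        return f"Category(id={self.cat_id}, name={self.name!r}, url={self.url!r})"

    def save_into_db(self, connection):
        """Insert this category and remember the generated primary key."""
        cursor = connection.execute(
            "INSERT INTO categories (name, url, parent_id) VALUES (?, ?, ?)",
            (self.name, self.url, self.parent_id),
        )
        connection.commit()
        self.cat_id = cursor.lastrowid


create_categories_table(conn)
books = Category("Books", "https://example.com/books")  # placeholder, not a real TIKI URL
books.save_into_db(conn)
stored = conn.execute("SELECT id, name, url FROM categories").fetchall()
```

In a full crawler, the name and URL passed to `Category` would come from the parsed HTML rather than being hard-coded.
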